Integrating Some Rust in VLC Media Player

Hi. Yeah, much better. So, I've been on a quest these past few years, trying to replace some C code, and this is what I want to talk to you about. And also that Mac, sorry, shitty platforms. OK, yeah, technical issues.

My name is Geoffroy Couprie, and I work at a platform-as-a-service company. I'm really happy to say we have Rust in production right now, and I did not write it: some colleagues just said "we don't like Bash, let's write some Rust", and then they came to talk to me: "OK, we're replacing some version, we have issues."

What I want to talk to you about today is another French project that you may have heard about, the one with the cone: VLC media player. It is a very interesting project. The basic idea is that whatever the format, whatever the protocol you throw at it, it should just work. The basic usage is: you get a video file, you drag and drop it on the software, and it should just play. Most of the time it works, but it comes with a catch. We get a lot of issues, a lot of vulnerabilities caused by memory problems: buffer overflows, memory leaks, crashes, double frees, anything. These are very common issues that we would like to solve.

They come from, well, we write everything in C, because it is old software and there was nothing else at the time. Every format is parsed manually, with handwritten parsers in C and manual memory management, and the formats are all very weird. We say that in video, everybody has an opinion on how a video format should be, and they're all wrong. You have specifications that are very unclear, that make some choices that can appear obtuse because whoever wrote the specification just wanted to reuse some old code from before. Then people interpret them in different, wrong ways, so you have multiple competing implementations of the same format, and you have to support all of them: there's that one big piece of software that makes those videos and does it in a bad way, and there's this other one, and you have to play everything. So you have lots of different cases, memory stuff, and it's all in C. It's
a nightmare. So about two years ago I set out to find some way to fix that. I had some requirements: it should be easy to write the parser correctly, because basically it's a nightmare to maintain that kind of code and a nightmare to test it; it should be memory safe; and it should be easy to embed in C. One of my first suggestions was "OK, let's do some Haskell", and then they took a look at the runtime and said "no way, no garbage collection, we don't want that inside VLC, no way". OK.

So maybe you see where I'm going with that. There's this nice little language that I started testing two years ago. At the time it was all of those sigils everywhere, and the compiler was breaking everything every two weeks. It was a very, very nice language, and it really kept me on my toes because I had to rewrite my code all the time. I am really not complaining to the language team: I know it had to evolve, it has come a long way since that time, and it is really amazing work.

On the side I did something fun that maybe some of you saw at the time. It was a project called rustics, which was basically a small Python script that I could point at a GitHub project. It would clone the project, try to build it with the latest compiler version, try to fix the code with some regexes (very, very ugly), and then push a pull request. It was very funny to do, because people got the pull request and said "wait, wait, how did you do that so fast?"

So with that, we now have a language we can use for writing some code in VLC that's memory safe. But is it enough to write parsers? A few of you may have written parsers manually already, so you know that's still not very easy. So I started to work on a project that's called nom. I called it nom because of a lot of great puns I can make, like "it eats your data byte by byte". nom is a parser combinators library. Parser combinators are just a technique based on simple, deterministic functions: a function gets an input and generates an output, that's it. And you combine
them in ways like: you apply one and then another one, or you alternate between different ones until you get one that parses correctly. And it's written with macros. Lots of macros, very, very large macros. Why? Because in 2014 I tried to do stuff that would have required the impl Trait feature that we're getting right now, and it was a mess. So I said "let's write macros, it should be a good idea", and basically it works.

The design is, as I said, very simple. We have functions that take an input and return an output type that's an enum. The enum can be Incomplete, to say "OK, we need more data". It can be an Error with additional information, like "bad input", but also which part of the input caused it. Or it can be Done, which returns the output value and the remaining part of the input. That way you can just take the Done part of the IResult and continue from the remaining input. It's a very, very simple design, and it uses macros.

OK, so this one is simple enough. We have named!, which just creates a function. We have terminated!, which is a combinator that takes two parsers, applies them one after the other, and takes the result from the first. So the alpha parser will recognize alphabetic characters and return the recognized string, but it must be terminated by a digit.

When it's generated, it's a bit hairy but simple enough. You have a function with a signature like we saw before, and it's just a match on the result of that parser on that input. You get a result, you get the remaining input, you match on that and then again, and you return Done with the remaining input and the output of the first parser. Most of the nom code is just match statements everywhere. It's very, very simple, very dumb code, and I chose to keep it really easy. In fact, it's not generated exactly like that, it's more like this, because reasons: you need to have the full path somewhere because it must be imported correctly. But it's really manageable.

Just quickly, the features: it can work on strings, on byte slices, on bit arrays. I took a lot of
inspiration from Parsec. It can use the regex crate. There's no syntax extension, so it has worked on Rust stable for two years, well, for as long as there's been a Rust stable. It can basically be as fast as C. And we have some really nice stuff with error management: you can do a hex dump and show which part of the input corresponds to which combinator. It's a gimmick, but it's very, very nice to see.

And coming soon: there's a whitespace-delimited format combinator, which is implemented in the dumbest way you could think of. You apply the ws! combinator on your parsing tree and it will just intersperse the space combinator everywhere inside the macros. It looks very ugly, but it works really, really well. So, performance gains and everything are coming soon, and it should be really nice to use.

So we have the language, we have the parsing library; it's about time we got to work. Because ever since I started working with Rust and got really into it, the VLC guys came to me: "OK, when are you done?" "Not started yet." OK, let's get to it.

First, let's see how it works. VLC, like most media applications, is just a pipeline. Data comes in through the access module, which is HTTP, FTP, file access, everything, so it gets data from somewhere. It passes that to a demuxer, which is basically where we're going to work. A demuxer is a parser that extracts the video, audio and subtitle streams and passes them to other parts of the software: the decoders, then the filters that are applied on the video. Then it goes to the output, or it's re-encoded and put into another format for transcoding, to a file or to the network. It's always a pipeline. All media software works like that, because basically you're just pushing stuff from one end to the other. The big issue you have in there is how you synchronize everything, because you have the audio and video streams being decoded and filtered and everything at the same time, and you want to make sure that the audio and the video correctly
synchronize, because otherwise it's really annoying.

OK, so the way it works in VLC is that you have client applications, like VLC media player or VLMC, which is a movie editing application. They call into libVLC, which is a public API over libvlccore, the big part that manages everything: the threading, the synchronization, the module stuff, all the APIs, how you access files on different platforms, everything. And where we work is there, in one of the modules. The ones we saw in the previous slide, everything in there is a module. They all link to libvlccore to get data, but it's libvlccore that loads them and tries to do stuff with them. So if we want to integrate something in VLC, we have to make a dynamic library. All right, it should be all right.

Let's see how it works. When you start, libvlccore just looks in the folder, sees a lot of libraries and tries to load them. It checks that they have one symbol in them and calls that function, which is vlc_entry plus the version. Then the module will just say "OK, I am this module, I can do that, and here are the callbacks you can use to talk to me". And I think it gets a bit hairy to do that in Rust, because I have to emulate a lot of stuff that's very specific.

So how do we integrate a Rust project in a C project when we are not in a self-contained project? If I want to rewrite a library in Rust, I can more or less easily make a completely compatible C API: you can just drop in the DLL and it should work correctly. But here we have something where you call C code and you are called by C code, it is interspersed all the way, and it's really annoying to do. But maybe we'll be able to manage that.

So we have a plan. First, as in all rewrite projects, we need to import some stuff from C: the structures, the functions we will need to use. Then we need to make ourselves pass as a VLC module. Then we actually write the parser; we chose an FLV parser,
which is a kind of easy format to parse. I could have chosen some very annoying stuff like MP4, but I really didn't have the courage. And then we can actually start to parse stuff.

So, reproducing structures. Maybe some of you have worked with bindgen and that kind of thing. I really wanted to use it, but I couldn't, because the C code in VLC uses a kind of object-like structure. You see the VLC_COMMON_MEMBERS stuff: it's a macro that just expands to some common attributes of a structure. There's also the union, which was, I don't know why, not well supported. I've not checked if it works right now. Well, if we cannot generate our structures in Rust automatically, let's just write them manually. It's just brute force: take some time, write everything. And you can see that you have to convert everything: you have a const thing somewhere that you need to transform to mut where you need it. It's menial work. It's something that really should be easy to automate, but I think we still have some time before we get very, very automated stuff. It's all right, it's something that I can do manually.

Then we start importing the functions from libvlccore. Again, this is easy enough. I think I could have automated part of it, but really, since I only need a few functions, it was quite easy to do.

This is where we can start to get smarter, because we have these functions, but we really don't want to call them directly and interact with C code. We don't want to be a C developer in Rust. So we make safer wrappers. You know there's a big unsafe in there; it's not in the function definition. You know there's something unsafe happening somewhere, and what you want is to have the guarantees everywhere else in your code. This is how I wrote that.

So we've started importing stuff. Then, OK, let's make a module. And yeah, you would say: macros again, and in C. This is how a VLC module is generated: basically it's a macro that lets you declare some stuff, and it expands to
something like that. It's very annoying to write. Basically VLC gives you a callback, and you call that callback again and again and again. We've already got the callbacks, the open and close functions, and, oh yeah, I have to write this in Rust.

This is the point where I went to JB, the VideoLAN project leader, and said "hey, can I just make C code that will call the Rust code? Because it would be easier that way." He looked at me and said "OK, no, it will be less fun that way". So yeah, it's a specific kind of fun. Let's write everything manually.

So, a few annoying things: we have to put "as" casts everywhere, we have to have string data null-terminated, as binary strings, and it's a lot of fun doing stuff that's unsafe everywhere because we have to call the callback. Again, I think it's something you can brute force: with enough time you get it working. I did not get it working in five minutes; it took a while to see how I could get there, because some of the strings did not come out quite right when loading. Maybe you can do better.

Yeah, macros. OK, this was a very small one, a very easy one that I wrote last week, which basically generates the same code. There's an interesting part there: you have to pass the function name. In the C code, the macro just expands the function name, but here it is passed manually, since concat_idents! does not work exactly right.

OK, so now we have something that actually loads in VLC. It took some time to get there; it can be a bit hard. So now let's begin parsing some stuff. FLV is a simple enough format. There's a first header that begins with the tag F, L, V, then a version number (a byte), another byte with flags indicating if you have audio, video and everything, and then an offset that shows where you should start parsing the rest of the data. It's interesting, because that means you can put it very far and just hide data before the packets. I don't know why they do that in some video formats, but it happens sometimes. And this is where we see that it's
annoying to be called by C code and to call C code, instead of having a completely self-contained project. We have a function that must pass as a C function, and we want to write good Rust code inside, but we still have to call other C functions everywhere. So the code has mixed feelings: you want to write safe, good code, but we don't really have good tools to do that, except by making really, really great wrappers for the C functions. But again, it's something that's manageable.

Here, the first thing we do is peek at the data, because libvlccore calls every module and says "OK, try to parse, and tell me if it's all right". If it's all right, you will be the module that gets used; otherwise you return an error, so you let another module take over.

So this is an interesting part: we don't own the data. This was a big design decision in nom: it's made to work on byte slices, on immutable byte slices, because most of the time you don't own the data that will be parsed. If you want to be a good citizen in the C world, you have to assume that you will not manage the memory. You can try to take over a lot of things, but at some point you have to make compromises. Well, it's all right.

So we call our parser, and the result is Done with the header. Then we get the offset, we seek to the offset, and then we start parsing. We give some functions that will be called by libvlccore, and we store some data we want to use. OK, easy enough.

Then you have to continue with the data. FLV is basically a lot of packets. You have something indicating the type of packet, the size, the timestamp, because you want to synchronize them, and the stream ID, saying "OK, this ID is the audio stream, this ID is the video stream". When you start parsing those, again, you have to pass as a C function, and you do hairy stuff at the beginning to have cleaner code afterwards. We read some data. OK, this is an interesting part with nom,
with which I can do a hex dump anywhere if I want to just see what happens. It's very, very useful when you do that kind of project. Then you match on your header and try to get the audio data. And again, calling C code everywhere is just a complete mess. Here is the part where we ask VLC to create a block with the data: you know that you have that much data, that's now your block, and you tell the C code "OK, you will pass that block to the rest of the code".

And there we have a very, very uncool hack. There's a function that takes a va_list. A va_list is a way in C to convert the variable number of arguments you got in your function into some kind of variable. It's very specific to C compilers and very hard to support correctly in Rust. It's a very big hack that's available in the va_list crate, but most of the time you have to use hacks like that. Again: isolate the unsafe in one part and keep the nice code everywhere else.

So maybe I should show that it really works, because otherwise you'd say I'm just bullshitting everybody. OK, so where is it? Where is the screen? No... oh yeah, that way. OK, is it big enough? Yeah, it should be.

So first we do a cargo build. You see nom, you see va_list, you see flavors, everything. Now there's a small bash script I use, because I need to slightly modify the DLL to point to the right one. I don't think it's something we can do with Cargo already, but since there's already a tool to do that, it's all right. OK, so now I have my compiled DLL inside VLC, and I have another script to start it. And yeah, we have some kind of video working. It's a very old advertisement; it's very hard to find good samples in FLV because it's such an old format. And you can see that I really like logging stuff all the time. Right.

OK, so we've got something working, but I worked a bit in isolation: I make my DLL, just copy it into VLC, and yeah, it works. Now, can we actually get that into the tree? OK, yeah. So basically this is the
biggest issue you will get in any project where you want to rewrite anything in Rust, because, as I said, everybody has an opinion: every build system is made mainly for one language. Cargo is the same: Cargo wants to build Rust and wants to manage everything. The autotools want to build whatever, I don't know, and they manage everything. But we can still be good citizens and try to do it correctly.

So first, autoconf: we have to check that cargo and rustc are there. That's something that goes in the configure script. OK, easy enough. I say that, but I did not write it; I'm really not an autotools guy.

The funny part is this one, which took an entire day to get working. Basically, when you build C code you make object files, and at some point there's libtool. libtool knows how to build dynamic libraries for the platform you're targeting, and it takes the object files and makes libraries. But Cargo also knows how to make dynamic libraries and wants to tell libtool "I know how to do it", and libtool won't listen. So what we do is kind of a hack: we make an object file (you don't see it here, but it's --emit obj or something), and then we give that to libtool, and it should mostly work.

It's very hard to be a good citizen in someone else's build system, and it's an issue you will get over and over in that kind of project. I have another project where I worked on mobile, and getting Cargo and everything to work with CocoaPods, and making libraries for all the platforms we needed, was really a pain. And getting bitcode working is just, no way.

So here's how a rewrite project works. First, you have to spend time on the build system. Here I could just work in isolation at first, but most of the time you will have to work on that from the beginning, because this is where you will have all the ergonomics issues. If every time you need to work you have to copy stuff and edit it manually, it will take too much time. You need to have everything automated. Second, you need to isolate the
unsafe APIs, because you know you're interacting with C and at some point there will be some unsafe stuff. You have to trust that the C code will do what it does and that it does it correctly. Most of the time it won't, but you have basically no choice in the matter. You have to be, again, a good citizen.

The parser is the easiest part, right? It took a few hours to write correctly. Integrating it in the rest is where it gets a bit hairy. And, very, very important: we don't own anything. The data is in the C code; the pointers, everything, are passed by the C code; and with the callbacks we have to play by the rules. This is still not ergonomic yet, but it's getting better. We can make safe and useful wrappers: if you take a look at some existing Rust wrapper crates, they got very easy to use, with the closures you're using. It's getting really nice.

From there: this was a prototype. It's not actually in VLC yet, so don't announce to everybody "yay, VideoLAN is doing some Rust". No, it's not there yet. It takes some time. Like Firefox: I know it took some time before Rust actually got into the tree, with lots of build system issues and everything. Downloading dependencies is something that I really need to sort out; I tested cargo vendor a bit. Basically, for VLC we have this big archive for Windows and macOS where you pre-download libraries, because you don't want people to rebuild everything all the time. And then I need to complete the bindings. The imports and everything are now a separate crate that can be imported in a new project to make new VLC modules, if anyone wants to test it.

So, just thanks to the people that helped. This may be someone you know, because he helped on documentation and that kind of thing for the Rust team, and he helped on the parser, on writing a plugin, and on making the early code I wrote better, because I really didn't care about indentation and that kind of thing. And Luca Barbato, really: he did all of the autotools stuff.

So doing a
rewrite is hard, but right now, in Rust, it is doable. It's something we can start to do almost anywhere: you take some C code, you start replacing it and see if it works, and then you try to convince the C developers that you will remove their code. Let's see how it goes.

Here's nom, if you want it. The slides are not there yet. Here's the FLV parser, the helper library to write modules, and a lot of test code. So if you have any questions, or if you want to try that kind of thing, shout. Otherwise, thank you for listening.
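
The parser design described in the talk (a function takes an immutable byte slice and returns Done, Error or Incomplete) and the FLV header layout (the F, L, V tag, a version byte, a flags byte, then an offset) can be sketched roughly as follows. This is a minimal hand-written illustration, not nom's actual API and not the code shown on the slides: the IResult enum, the Header struct and the header function are hypothetical names chosen for this sketch, and the flag bit positions (0x04 for audio, 0x01 for video) and the big-endian 32-bit offset are assumptions taken from the FLV file format rather than from the talk.

```rust
// Hypothetical result type mirroring the three cases named in the talk.
// It borrows the input: the parser never owns the data.
#[derive(Debug, PartialEq)]
enum IResult<'a, O> {
    Done(&'a [u8], O),    // remaining input and parsed value
    Error(&'static str),  // the input is not what we expected
    Incomplete(usize),    // how many more bytes are needed
}

// Hypothetical struct holding the FLV file header fields.
#[derive(Debug, PartialEq)]
struct Header {
    version: u8,
    audio: bool,
    video: bool,
    offset: u32,
}

// Parses the 9-byte FLV file header from an immutable byte slice.
fn header<'a>(input: &'a [u8]) -> IResult<'a, Header> {
    const LEN: usize = 9;
    if input.len() < LEN {
        return IResult::Incomplete(LEN - input.len());
    }
    if !input.starts_with(b"FLV") {
        return IResult::Error("missing FLV tag");
    }
    let flags = input[4];
    let parsed = Header {
        version: input[3],
        audio: flags & 0x04 != 0, // assumed bit layout, see lead-in
        video: flags & 0x01 != 0,
        offset: u32::from_be_bytes([input[5], input[6], input[7], input[8]]),
    };
    IResult::Done(&input[LEN..], parsed)
}

fn main() {
    // "FLV", version 1, audio + video flags, offset 9, then one extra byte.
    let data = [b'F', b'L', b'V', 1, 0x05, 0, 0, 0, 9, 0xAA];
    match header(&data) {
        IResult::Done(rest, h) => println!("{:?}, {} byte(s) left", h, rest.len()),
        other => println!("{:?}", other),
    }
}
```

A caller in the position of the demux module would match on the result: on Done it would seek to the offset and continue with the packets, on Incomplete it would ask for more data, and on Error it would let another module take over.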