Unikernels

Right. I've already tweeted a link to a blog post that I put out, because I changed the title of the talk: after chatting to some people earlier this week, I think it's much more useful for me to focus on one of the pieces of the personal cloud stack that we're working on. So I'll be talking about unikernels, and the slides are already up at that link. Something I'll be asking you to do is to send me feedback about this talk at the end. I'll share a link with you then, so you can go to a web page and just rate the talk and tell me what you thought about it.

So, just an overview. The size of these is roughly where I expect to spend most of the time: talking about unikernels, then a little bit at the end about why I care about this, and some other deployments.

A little bit about me. I am a program manager, project manager; I think about product stuff in a group called OCaml Labs at the University of Cambridge, in the computer science department. I do a lot of community stuff for MirageOS, which is the project I'm going to talk about mostly. I herd cats, and I like systems stuff. My background covers physics, I used to be a neuroscientist, and now I'm kind of pretending to be a computer scientist. I've worked in startups and big companies.

So if you think about software today, very simply, it's an application and it sits on top of an operating system. But it's not just that. We build this stuff locally, on machines that you have with you, but these days you deploy it remotely, so the environment that you're pushing these things into is quite different. There's lots of complexity in between, and that means you end up needing far more tools to make this work properly, consistently and reliably. So software is complex, but the funny thing is that most of the applications we're shipping are single-purpose. When we push things out to the cloud, they essentially just do one thing, and
complexity here is the enemy. When things are more complex, you have more layers, and that makes the configuration tricky. All that duplication of the same essential stack everywhere is inefficient. The size of the things we're pushing around means that it takes a while for them to boot. And just generally, the more stuff you have, the more stuff can go wrong, and the larger the attack surface is going to be. There's another question underneath all this: why are we building software for the cloud the same way that we did it for desktops? It actually doesn't make sense if you dig down and think about it.

So can we do better than this? Of course the answer is yes, we can. Here are the things we essentially need to try and do. Disentangle the applications from the operating systems themselves, so we can think of these things separately. Break up all that operating system functionality into small modular libraries, so that you can compose them, pulling in only the bits of system functionality that your application actually needs and ignoring the stuff that it doesn't need. And also make it really easy to target multiple platforms using the same code base.

This is where unikernels come in. A quick definition: unikernels are specialized virtual machine images. They're built using a modular stack, where you pull in all the system libraries and the configuration along with your application code, and you produce a single-purpose virtual machine with a single address space. Another way to think of it is that every application is compiled into its own specialized OS that runs on the cloud or on embedded devices.

I'm specifically going to be talking about MirageOS. Mirage is written in OCaml, which is a strongly, statically typed language. One of the good things about OCaml is that it has a very, very good module system and a very good type inference engine. And what we've had to do here is
actually rewrite a lot of the protocols, like TCP and a whole bunch of other things, in pure OCaml. That's taken many years of work, but it's been worth it, and that essentially is what makes producing a unikernel from OCaml code possible.

This is essentially how it works. You take this giant stack of stuff, you run it through the compiler, and you produce that small pile of stuff at the end, which is the unikernel, and you can target it at different backends. The important thing is that the development cycle is quite familiar to you, but the deployment scenarios are still quite broad. What that means is that you don't have to use the entirety of the tool stack. You can develop your logic using the native stuff available to you on your Unix-based machine: all the networking that comes with your OS, with just your Mirage application code on top. Then you can swap out the native Unix stuff and use the Mirage-aware system libraries, the ones that we've rewritten, and that allows you to test your code. And then you can think about deployment, and this is just a command-line flag, essentially saying "I'm not deploying on Unix now, I'm deploying on Xen." The same code runs, and it targets the Xen hypervisor, which can be running on either x86, which is the public cloud that we have today, or Xen on ARM, which is embedded devices for the future. So it's the same code base.

So this all kind of sounds cool. A lot of this work has been done in academia for a while, and one of the things I want to get across is that this method of creating and deploying software is very good, because the binaries that you produce, the virtual machines, are really, really small, and that lends itself to a lot of benefits. And because they're small, the attack surface is smaller. So a lot of things become possible with this approach that are not possible using the traditional stacks, and the example I'm going to use to try and lead you through
this is that of static websites. Everyone know what a static website is? Cool. The reason for this is that it's easy to understand. I use Jekyll for my website, so we can think about how we take a Jekyll website and turn that into a unikernel. But the process is the same for any kind of application that you write; it's important to bear that in mind.

So these are a bunch of static websites. That's my blog post, which is live, on the left. On the top right you have the Mirage website, and this is another one of the contributors' websites. And that's something called the Bitcoin Piñata, which is another interesting static website, using the new TLS stack. So let's focus on this one first.

I've said they're small. How small? Anyone want to take a guess? So this is a unikernel which is only using the bits of the networking stuff that are required to serve some traffic over the internet, and it holds a bunch of bitcoins. How big do you think the equivalent is in the traditional stack? Anyone want to guess how much of the other side of the screen I'm going to take up? Close: it's 200 megs. That is about 25 times bigger than this, but they're both doing the same thing. The really important thing is that this one on the left doesn't have any extra stuff. There's no shell here, there's nothing to log into, there's nothing to break into to manipulate or to pivot around. The one on the other side could have all kinds of stuff in its stack. You can't hold all that information in your head, so you don't actually know what's running in there. So if you're trying to run something secure, which this Bitcoin Piñata is, that attack surface is huge.

Just for fun: if that was planet Earth, what would this other one be? No, it's Pluto. I'm kind of cheating a bit, because I'm just considering the area of the circle there, but that's an example of how much smaller these things are than the traditional stack, with the same functionality;
that's important. These are also really easy to deploy: because they're so small, you can check the entire thing in to Git. Who uses Git? Show of hands. OK. So the workflow that you're familiar with in Git, git push, git checkout, all of that stuff, becomes available to you, and the thing you're checking in is the virtual machine; it is the application. That means you can get to a Heroku for unikernels in about 100 lines of code. Actually a bit more now, because the rest of the development team added a bunch of extra stuff.

So I'm going to walk you through an example of what this looks like using a unikernel. Now I'm going to switch, and if any of you read my blog post, you'll already have realized I have not prepared this. I was writing this talk yesterday, so this part is live; we'll see how this goes.

So first, this is the post. Just to let you know, all the links are here if you want to follow them. Essentially we're going to build a static website: we're going to take a Jekyll website and turn it into a unikernel. Helpfully, we've got mirage-skeleton, which has a whole bunch of examples. There's a DNS server, a bunch of stuff here, and a static website. We also have a secure static website, but I'm just going to look at the static website. We have two files that we need to use: config and dispatch. I'm not going to dive into the code; I'll just show you what it looks like so you get an idea of how it works. The config pulls in things from the environment to figure out what instructions you're giving it, for example direct or sockets for what networking stuff you're going to use. I won't dwell on that too much.

Now for my website, which is a Jekyll website. I'm using GitHub Pages. I just added an extra directory, _mirage, and I've just dumped those two files in here, and I'm glad you can see that screen. There's a very minor change in here. Has anyone used Travis
before? Travis is a CI system, and we've hooked this up to Travis as well. This is what the build script looks like; it's fairly straightforward. We're getting an OCaml environment and a bunch of extra stuff, and then here are the commands that we actually use to build: oops, make. Very simple, and then the rest just happens. There's also deployment stuff, which we'll talk about. So Travis has built this. This is the post I pushed earlier, so it built and went live, and this has also now been checked in to Git. My site is actually still running on GitHub Pages for the moment, because I haven't finished; that will change.

What we're going to look at now is the Mirage website, which is here. The first thing I'm going to do is talk you through the whole toolchain using Unix, and then move on to how we deploy to Xen. There's nothing worth showing locally at the moment, so now I'm going to switch to here. Can you see that? OK.

So we are here, in the source directory of the Mirage website. It looks like this; there's a lot more stuff going on here. I can show you what the config file looks like: quite a bit more complicated. This is one of the first sites that we built, essentially to dogfood our stuff. So we're going to first build locally: configure, Unix. That's all it takes. I have the entire OCaml toolchain, the appropriate toolchain, already installed, but I actually want to run this using the Unix network stack, so I'm going to do this. That should do some stuff. That worked. That also generated a file; we can look at what that file looks like. There's a bunch of stuff in here, and I'm not going to go into this either. This essentially is more information about the configuration for that unikernel, and all of this will get compiled into the unikernel. Now we just make, and we sit and wait and twiddle thumbs, and that's it. That's now built a unikernel; it's here. Now we can run that,
and it will run locally and do some stuff, and now we should be able to see this here. Yeah, so this is the site now running locally. So you can see the model is fairly straightforward: you write your application, you test things locally, it all works, and now you want to deploy that out to somewhere else. If I wanted to take the same code base and now compile it for the cloud, I would do this. But I don't want to do that on my machine here; I actually want to do that through a whole toolchain, and this is where we hook into Travis.

So this is the actual website for Mirage (whoops). When we push here, what we want is for Travis to pick this up and do all that build for us, so it builds the Mirage Xen unikernel at the end. This is the last push that happened. Ignore the little red thing there; this was the last successful build. If we follow this all the way to the end, you'll notice that we've also got a part where it finds a deployment repo and pushes the built unikernel into that repo.

This is that repo, so we can look at the size of this unikernel now, if I have it here. Those are the two images. It's pushed the zipped image and, as you can see, I've already been trying this out. The compressed image is 4.6 megabytes, and that's what's checked in to Git. The uncompressed image that we're going to run is about 15 megabytes, and that's the entirety of the site. That includes all the stuff that's needed to serve the traffic: the web server, the networking stack, everything. That's the entirety of the unikernel, and that gets deployed onto a machine.

We have a whole bunch of scripts that live in the deployment repo. We have another machine, hosted on Bytemark, that pulls from this repo, looks at what's changed, and then turns off the old virtual machine and turns on the new one. And then you end up with a deployment that goes from what's happened on
your machine all the way up to the live site, and that's an end-to-end toolchain. We only had to add about 100 to 250 extra lines of code, which is basically a bunch of scripts to glue the things together, because we have a very small, simple thing that we need to push around. Does that make sense? Some nodding. OK. So that's essentially what the procedure is for building a unikernel and pushing it out live, and that workflow is general. You can apply it to any kind of application that you're building; it doesn't have to be a static website. This is just a simple example to get your head around.

There are other ones that have been deployed. The Bitcoin Piñata is a really interesting one, because members of the team built a complete implementation of TLS in pure OCaml. What that means is we can now build secure unikernels, and that's essentially what they did. They took a bunch of bitcoins, put them into a unikernel, put it on the internet and said "break our stuff." That's been up since about February, and the bitcoins are still there, so I assume that's a good thing. We can't prove security by the absence of anyone hacking into it, but it does provide some faith in the rest of the stack, because that piñata has been bombarded and it has stayed up.

Other clever things you can do are slightly crazy stuff, like a unikernel, a virtual machine, for every URL. For every URL on your website, by writing a short script (Magnus did this with about 80 lines of Python), you can spin up a unikernel with a full stack. Every single URL, right down to the small PNG file for the RSS feed, was being served from its own virtual machine. I don't know why you'd want to do that, but it was cool. Just try imagining doing that using the traditional full stacks. This kind of experimentation is only possible when you take this kind of approach. And other people have built other
simple HTTP REST services, and we also have a bunch of other tools as well, related to data storage. Of course there's a trade-off: everything here is written in OCaml, so if you want to use this stuff, for now you have to write all your things in OCaml. But that's only for the time being; in the long term we know that we need to think about how legacy applications will work with this as well.

So now I can get on to why I actually care about this. This is what the talk was originally going to be about, but I wanted to give you more of an overview of unikernels first. Essentially I want to help empower individuals to own their own piece of the internet, and in order for that to work we need to have distributed personal clouds. No one wants to be a sysadmin just for running their own infrastructure, so to achieve this we need resilient, scalable systems, and that's essentially why I care about this.

I use a whole bunch of different services, and this is great, as a developer and also as an end user. But this is much more like feudal computing, because when the services change their rules, after a while you start to feel like you're cattle being grazed for the data that you're producing. There's always something new coming up, there's always someone else who wants to access your contacts or something else, and what value am I really getting from that anymore? So there's a slight sense of malaise, and it's sad. The only way that I see out of this is to make it possible for every individual to start owning their own piece of the internet, but without having to become a sysadmin. We need reliable, secure, scalable infrastructure to make that possible, and that's what this stack is about. Mirage is one piece of it, which is how to build the application and how to deploy: the entire lifecycle management of the software. Irmin is another piece,
which has a whole talk of its own, on data storage; it speaks fluent Git. And Signpost is a new project, which is about how you form connections between all your devices, so I can talk from my phone to my laptop over the local network without having to route through the cloud. The first few applications to target will be mail, contacts and calendars, because they are the core of everybody's infrastructure. If I don't trust this stuff to run my email, I'm probably not going to trust it, full stop. We do have an IMAP implementation and SMTP. I'm not using it yet, it's still in alpha, but it kind of works. So there's a pathway for us to get from the tools that we're using now to tools like this, which you can run for yourself.

Another reason this matters is that there are way more devices like this coming, arriving in the next few years. This is the whole wave of the Internet of Things, and a lot of these things don't need a cloud back end. They just need something quick to help form a connection between two endpoints on the network and then do some stuff, and that stuff can actually happen locally.

Has anyone heard of the Good Night Lamp? A few hands. I'll just describe it quickly. You buy the big box, and you give your friends and family the little boxes. When you turn the light off and on on the big box, it turns off and on the little boxes. That's it. That's quite neat if you have family all around the world, because they know when you're home, and it's just fun. But that needs a back-end service to provide the identity, the connectivity, and the pushing of little bits of information around, and that doesn't have to be a cloud service. If they end up going out of business, all those little light boxes in people's houses don't work anymore.

Nest, for example, is an interesting one, because you have a thermostat. Think about what the life cycle is of a thermostat in your house. How long will that actually be there? But
then people are faced with issues like this, and this was a tweet a couple of days ago: what's my option if I don't like the new license agreement? Change my thermostat? Do I really want to do that? That's why I think we need to move towards a world where people are able to take control of their own infrastructure without having to give up everything to a third party.

You can contribute to this stuff; everything I've talked about is fully open source. mirage.io is the core of the infrastructure that helps us build unikernels. There are also other unikernel implementations out there in other languages: just off the top of my head, HaLVM uses Haskell, LING is Erlang, and there's a bunch of others as well. Nymote is the website where we're aiming to take the unikernel approach and apply it to personal clouds, and OCaml is the language that we're building all of this stuff in.

I would really like your feedback, so I've put up a page: amirchaudhry.com/feedback. If you go there you'll see just two simple questions, rate the talk and would you recommend it to a friend, and a little box if you want to tell me anything else. I would really, really like to know what you thought, both about what I talked about and anything else you want to say.

Oops, I finished earlier than I thought; I was worried about overrunning. So we have time for questions.

So you have a choice of whether you want to use the networking stack that is part of the operating system underneath, the Unix networking stack, or whether you want to start using the libraries that we've built. For example, if I want to use the networking that's there on my Mac, I can target Unix and use the Unix networking infrastructure. But if I want to use the libraries that we've built, the TCP stack and the other parts of the toolchain, then you can use those. I'm waffling now. Yeah, you can either use the libraries we've built or use the
libraries that are available on Unix; you get to choose. The reason for doing that is that you develop your application logic using the stuff that you know already works, and then compile it again using the system libraries that we've written, just to check that everything is still working. Did that answer your question?

I totally agree that the cloud should be given back to the users, but can you explain again how unikernels do that better than traditional operating systems?

Because they're so small and easy to manage. Firstly, there are security benefits: unikernels are inherently more secure, because the attack surface is much, much smaller, and there's a whole class of bugs that won't be part of them because of the way they're developed. And if you can make things much, much smaller, you can deploy them to more devices. So imagine taking the unikernels and deploying them to embedded devices, so you can run such services in your home. If they're just as performant and do the same job as something else, then it's a lot easier to keep them up to date, because you simply start a new virtual machine if there's an update, rather than having to worry about patches and updates and essentially becoming a sysadmin. The idea is to avoid becoming a sysadmin. So I can either run this myself or I can trust someone else to run it for me, but either way, because I'm using fewer resources and the attack surface is smaller, there are benefits in terms of cost and security.

I didn't understand exactly: the principle is to run on the bare machine, right? You run the unikernel on basically a bare processor?

We'll be running on Xen. Perhaps I wasn't clear: Xen is a hypervisor, so it provides a virtualized interface, and we target Xen.

And could it equally run on QEMU or something?

You could plug that into QEMU, I think; we've done some work in that area, but for the moment we're just focusing on
Mirage, sorry, on Xen.

Would you like the technical question or the political question? I'll try the technical one first. You talked about checking the built blobs in to Git. I wouldn't want to clone that repo.

That's a good question, and I literally did do that to see the sizes of the unikernels for Mirage. You can do shallow clones, so you don't have to clone the entire thing. I just cloned literally the last commit.

Cool. And the political question: there's this dynamic where all of the cloud services host all of my data. Google owns my email. I have a slight hedge against that, but it's in Google's interest to keep owning all of my stuff. It's not in Nest's interest to make a thermostat that makes it easy for you to swap other software in. Do you foresee a world where people make lots of things that put people's own control of their data as a priority?

OK, yeah. The answer to that question is about business models, because that's another thing I didn't talk about that also has to happen: we need new business models, essentially ways of making money on the internet that don't involve harvesting loads of data. That will essentially solve that part of the problem, because you can't ignore other businesses who want to make money. The idea is that if you can build services like this, you can charge for something else: perhaps charge for the service, metered much, much lower, and just have more people using it. There may be another business model in there. That's where the challenge is from the other side: there has to be a way of making money, because if there isn't a way of making money, it won't get adopted.

How does the performance of your system libraries compare to more mature operating systems?

It's good. There are papers, and if you look at the Mirage website they will be linked. For example, we did a DNS server; we got that down to
a few hundred kilobytes, and we compared the performance of that with, I think, BIND and another one, and the throughput is just as good. So you're not taking a hit in terms of how useful these tools are, but you're getting all the benefits of having something way, way smaller and way more resilient.

How are you doing persistent storage, since right now you have static images?

Persistent storage is part of the story I didn't talk about. That's Irmin, which is a Git-like data store. It speaks fluent Git, and you can treat it as an append-only store, so you can push data into that and use it to persist data. I didn't really talk about that too much because it's quite involved, but that's the story for how we do data persistence. I'll have to talk to you afterwards, but yeah, it speaks fluent Git. I don't necessarily know all the details. It's not a database; you can interact with it using all the Git tools that you already have. Does that make sense? No? It might be easier if I show you a quick demo; there's a video I can show you. If you search for Mirage Irmin, you'll probably get to the blog post where we describe all of it. Also, I don't know that much about it yet; it's still very new.

This is kind of an extension of that question about the new business models, because we need to run it somewhere, right? We can build our own kind of secure thing for ourselves, but we would need to run it somewhere, so we need to buy a service from somebody. We would need to trust them, and they would need to have some kind of good business model. Because the margin is so small, I don't know how they'd make money, given the already ridiculously cheap cloud services, like the five-dollar ones out there. So do you have any ideas on your blog, or can you direct me somewhere where I can read about it, like the new business models
and the ideas of how to deploy all that in a secure, trustable way somewhere? Or do I need to run my own hardware?

There are. There's another project we're working on with a collection of universities, looking at business models for, for example, personal data. If you make it easy for people to capture information about themselves, can they monetize that data somehow? In that scenario all the stuff is staying within the home, because technically that's easier for them to build, but in the long term they're also thinking about how you'd build such a system in a distributed way. If you want to search for that, search for HAT project. I should add that link to the blog post as well. But there are people considering business models too.

Any last one? No? OK, thanks.

We're going to take a 10-minute break, and after that we've got Nikita with the talk "Programming Web UI with Database in the Browser".