The Modern Prometheus

17 1

[Music] [Music] if any speaker tells you ever that they're not nervous they're probably lying can I'm really always super nervous at the beginning of my talk so I try to kind of offload some of the nervousness to you there is a saying among speakers that good talk starts with audience participation I don't really like it when I'm in the audience so I won't put you through this but just in case how many people have been to a talk that started with the raise of hands thanks dot so um I'm Kate I'm kind of technically officially polish but my mother was born in Silesia and my paternal great-grandfather was also born in Silesia that puts like over half of my genes a little bit over on this legend side and my mother is from some computation of scale which is a solution CD that if you're into European history probably know that wasn't called this way before it's a control war and it's been from cavitational scale for the last seven years but for the previous seven hundred years I'd heard a different name it was a city that was created at the crossroads between the road suave so Breslau and plug on one one one hand and nice kid meat which is Sydney channel and Nyssa which is nestled on the other and there were three attempts to locate the city at the crossroads because it was a cross source of major trade routes and the first one was Frankenberg and that didn't work out they try to do another city it was called Levenstein that also didn't work out that I thought that maybe coming up with new names is not necessarily the good idea so the third city that they located there they decided to call after those two cities Frankenstein so Frankenberger one hand Levenstein on the other and this is short on cash sign is the castle and some combination of care obviously and it's not the same castle where the Mary Shelley's Frankenstein was happening because that's in Darmstadt in Essen but I thought it's a super cool name that my mother comes from Frankenstein even if she was a bit late to the city be called that way I want to talk about Frankenstein programming which is seamlessly combining more than one language and I show you how to combine two and a little screen to go for eyes three languages together in one even file and see how it goes there is this meme about Frankenstein not being the name of the monster but rather than doctor who created him like common misconception big man is actually the name of the clocks creator the correct name is victims monster if you don't like this meme I'm happy to report that my talk will be over in 30 minutes so benchmarks there will be a lot of benchmarks and I want to give you a huge cave yet about benchmarks a wonderful quote from along our McHugh and she said synthetic benchmarks tell sweet Fanny Adams about real-world performance of code architecture being a much more significant consideration that the proportion of Rome it's a given language will deliver on a given platform the average netbook could happily run all of stellar fusion bomb models I won't put the photo lumetri analyzes of all the Apollo missions in the puzzles between loading xkcd comics and pinning gem file without the user being an advisor don't try to fix your applications by rewriting them one to one in another language don't try to think that you know and I'm pretty sure this is the bottleneck so I'll just rewrite it always try to fix your architecture first and then try to benchmark and he actually where things go but don't fall for the synthetic benchmark like small benchmarks that show that this language is faster than the other one don't fall for this it's usually people have at least the sentiments to learn more of the languages that they benchmarking so it shows more or less I want to show this on a simple example there are a lot of people who say that you shouldn't do control flow with exceptions and there are good arguments for this there are bad arguments for this and one of the worst is that exceptions in Ruby are slow so let's see how slow they are actually I would assume a small active record so rails application that connects the local database so it's as fast as it can be assuming there is some kind of database behind it and it works on localhost so there's no network latency we create a simple tape whatever's with the one text : we can't go any sensor and we migrate the database up and down when the script finishes and then we compare how much it takes for exception miss so code pad that doesn't hit the exception exception hit a code but that actually raises something we rescue it because we need to like don't want our benchmark suite to blow up and then we do the simplest select which in in the case of in the case of active record is just a first row and then we do a simple insert also the simplest table you know no indices so there's no index overhead and so on and when we run it we see that yeah exception which is super fast and exceptions are super slow they're ten times slower but you know a simple select is 300 times slower still than an exception hit and you know an insert is almost 7000 times slower than raising an exception so don't fall for people who say like exceptions are slow make them show you benchmarks if you have data let's look at data if all we have our opinion says go with mine I love this quote because I use it all the time as the office um please please always benchmark first adjust your algorithm there is a urban legend / common idea that you can make your algorithm your implementation or bottleneck 10 times 10 times faster just by adjusting algorithm it's not you know universal but it's surprisingly often the case it's surprisingly often you can rewrite something and see that it runs actually an order of magnitude faster and you'll see it in a minute and also consider whether there may be our alternative implementation for your language so if you run on Ruby there's JRuby there is a Java will be coming up there's a there's been in the past other implementations and the same that there's a JSON for Python and so on and so forth but if you did all this and it's still too slow it might be worth way to log into other languages there is this kind of tweets like oh it's awesome crystal is awesome we can have you know a million requests per second on a single server always look at what they are benchmarking because I don't know what fear there is no link to the benchmark if I were malicious I could say that maybe it's just a you know empty file and what you should benchmark against is just Apache serving a directory and maybe that's what it's actually happening here so always always verify what people are actually benchmarking people so is the company that created the story tracking software the actual name of the software is Komodo monsters actually grew much more Thunder towards people after working with it for half a year maybe you know Stockholm Syndrome ladies and is anyway let's talk about binary representation let's say we have a set of elements for example integers and we want to do things with sets of integers this is not a made-up requirement it's not something that I only used in my PhD because I actually was working on Sofala - this is actually a use case from a startup that was delivering food to offices and they had an array of integers which denoted which offices have which food allergies and which menus trigger which food allergies or actually don't trigger them maybe so let's represent them by setting a bit on on a single integers rather than having a large set an RA or much better set actual set of integers that have one huge integer with certain bits being set to one so for example if you have a set of 0 2 3 & 5 you can have an integer where the view of second third and fifth bits are set and then you can do set operations as binary operations so if you have those two sets and you want to find the intersection you can do this binary operation of just ending or intersecting those two integers and this operation and any decent language should be one processor operation it's probably not in Ruby obviously but you know in a different language that actually knows how to compile things so that's why you get a different integer with this particular bit notation and you can kind of interpret it back as a set of two and three which actually is the intersection but what I want to talk about is the size of those sets if you want to have a size of the set you need to do something else called population console the number of ones that are set in the binary representation if you have number 6 then 1 1 0 is the binary representation there are 2 ones in that so for this set the pop count will be 4 4 this one will be 3 4 this one will be 2 so this is the number of elements in our set without leaving the binary representation butuan without having to count those elements at all provided we have this method so let's try to intermitted in Ruby in Ruby you can reopen classes including like core classes like integer and Patterson code this freedom touching so lets freedom bash integer let's add an instance method that is called pop comp to s and let's implement it like super nightly let's use the 2's method that turns an integer into a string but if you pass it a number it will turn it into a string of good of a given base so a binary representation is 2 s with the 2 as the base of the conversion and then let's count the number of 1 sub strings or sub strings with the character 1 in them if you're internally squinting now and in general discusses by this implementation I want to defend that it's super simple and super readable probably but also you might think that obviously there are faster implementations like obviously you can just try to shift the number write one bit at a time and then increase the counter by either 0 or 1 depending on what's the last bit and this is basically what it does is if you have the number like 1 1 0 which is 6 then that loop would fire three times one for each of those bits so this will only happen as many times as there are bits in the most like in the shortest representation binary representation of your number so this obviously should be so much faster than going to a string and counting sub strings in it there's also this implementation you don't need to kind of figure out how it works the gist is that it only loops then the number of times it loops is the number of ones in your string so if your string is in your number if your number is 1 and 27 zeros this loop will only fire once where the previous one will fire 28 times and then there's this implementation which is absolutely lonely and you could say this some you know random numbers that people just dreamt up but first it actually works second those numbers are not random like the five five five five in hex or 0 1 0 1 0 1 0 1 in binary the threes are 0 0 1 1 and so on and so forth so this implementation does some masking and then some calculations that are maybe faster maybe not we'll see in a second but then you could think like there are actually languages that have this stuff implemented there's raft which implements it has built-in method that is called count ones that you can call on an integer and it will give you the number of ones in that integers of the population count and thanks to the wonderful HELOCs project you can use Rast code from your Ruby application without having almost any having to do almost any work this is a rest file that just gets compiled by hallux and HELOCs then hooks it up to your Ruby in a way that you can call pop countable count and pass it a number and you get a ruby number back super useful hopefully this will be fast it goes through a compiled language that counts the ones but we can go even deeper we don't have to leave Ruby there's a ruby inline gem that lets you write C in your Ruby code you write it in a string and when this code is loaded this is like class level code so it executes the moment you load this file and this C code is compiled and then the Ruby code hooks into it creating a function that is named exactly as the C function that this code exposes so this gives you an instance method pop count bit Ln C into the into any integer in your code that's actually written in C an adapter also the bit elimination approach but if you have a laptop with a processor that is newer than November 2008 which you probably have then you have Nehalem or later architecture how I mostly want before send the bridge and ever since that architecture there's an SS e for point to support and those processors which means there is a direct assembly code that executes directly on that in one cycle that cow the ones and indecent C compilers you have exposed this kind of things in this case and GCC it underscore underscore building underscores pop cult L so this code is in pure Ruby on the first look at this of it at the moment it's executed the C in that string is compiled and the compiler knows to compile this line into a single instruction if your processor supports it this might be fast let's see but as I was saying not everything needs to be written in a different language to be made faster if we pre count all of the population counts for numbers from 0 to 65,000 and this takes less than a second then we can just you know slice our number into four parts we were like softly assuming from some point on that we are doing 64-bit numbers we can slide our number into four parts like mask them out and look into this cache and just add this like we have pre computed all of the pop counts for you know any 16-bit number let's see how fast this is and we can benchmark it very nicely the benchmark IPS gem does the kind of benchmarking we saw already but Ruby also you can call an instant method method on a class and it gives you all the instant methods as an array and then you can grep of the method that starts with pop count underscore from that array of methods so now we have an array of method names that start with pop count we do like a soft check whether all of our pop count methods return actually the same numbers for that random array of a thousand 64 bit or 62 bit in this case integers and then we basically loop through the methods and map the numbers from the array over the given method so we can actually compare how this works and if we run it on Ruby 2 point 4 we see that the built in one the one that compiles a C code that should compile to a single instruction can run like 30,000 times a second then the other implementation in c around 70 ten ten thousand times a second but the next one the first pure Ruby one is actually comes up at you know five thousand times a second it's not that slow interestingly the rest one is slower and significantly slower than that always benchmark the two s1 is only five times slower than the cached one so going through a string and counting sub strings in it it's actually only five times slower than doing the pre-caching and surprisingly maybe to a lot of us it was to me it's actually faster and significantly faster than the bit manipulations run in Ruby so the reason is that in C Ruby the 2's method is actually implemented in C the count substring one method is also interested in C so this is why it's faster than the bit elimination and the continuous shifting methods but there are other Ruby implementations and JRuby those numbers looks interesting the cached one is slowly faster the proc shift one is significantly faster bit elimination is significantly faster like twice as fast almost the two s1 is actually slower continuous shift link is again over twice as fast but JRuby all from JVM and JVM kind of sometimes needs a bit of time to warm up and also JRuby by default uses some of the invokedynamic mechanism introduces in Java season seven but we can actually force it to use more of them so we can give there will be a longer warm-up and force it to use as much info dynamic as it can and those number differ then I think about a benchmark IPS gem is that it tracks standard deviation so it can actually tell you that cache is slower than proc shift but the difference is smaller than the standard deviation that you had so they are actually same age and this is this is actual output from the benchmark IPS gem for which I'm super grateful because no longer somebody comes to you and tells you like this my algorithm that has three pages and two percent faster than your one-liner and even though the standard deviation is like ten percent so as you can see the numbers differ the 2's number doesn't match so again not only benchmark but actually no where's your benchmarking like don't benchmark JRuby on a kind of not warmed up JVM because the numbers might be different we'll see whether they're always different later this looks like Ruby but it's not it's in Crystal and crystals a compiled language that has actual static typing it just is very good at answering the types so the lines inside of this tract are down to every single letter exactly the same code as in the Ruby example this is like byte by byte and letter by letter exactly the same code as in Ruby except we reopen the instruct rather than reopening the integer class and we don't have this kind of dynamic for the word the dynamic introspection that you have in Ruby so we could do that with a macro but that would be an overhead so I just spelled out the metals that we want to call here but also crystal gives you a pop count method that has no suffix so it's like natural non implemented by us popcorn a total integer and here's warning Ruby times include Ruby side call overhead and I will talk about it in a second but just to remind you these are numbers from Ruby two point four with rusty and you know assembly through C and so on and these are the crystal times so you know sometimes you can get exactly the same code compile it with a different with a compiler for a different language and you get three different numbers and this is of course very inferred way of putting this but you know I just really like crystal so as I was saying never trust people who like one of the languages they're talking about I also really like Ruby um interest I don't like wow I like C to some extent so again you know always benchmark and a huge keg at at the top the crystal times are just by compiling crustal crystal they are not calls made from Ruby to crystal because there's a overhead and dynamic dynamic languages and you know it's not that the rest implementation is slower than the crystal one it just there is an overhead and calling the rest code from Ruby so this is not a fur comparison but at the same time crystal is very similar to Ruby so you might be and you probably should be tempted to maybe not necessarily implement part of your Ruby code and crystal but rather factor out and micro service and have a pure crystal micro service which is fast and just make it communicate with Ruby like outside of the Ruby code somehow okay the name of the center's creator and it's cookies monster not Cookie Monster read a book sometime when we when we cross between different languages we have to figure out how to pass things that the best way is not passing anything so have a like one language write something to the database the other language it's from database does some computation writes it back to the database and then this way you don't have to pass anything between those two languages the next simplest one is bullying's usually the rulings are implemented in some simple way in both languages so you can pass them and convert them on the fly in a way that is not really problematic with numbers we start getting issues because for example in Ruby you having teachers that are they used to be divided into fixed noms which are immediate and big names which are you know any large integer and in Ruby 2.4 they were combined into a single integer class so you can't really even like you can down to backward compatibility but you get a warning that you shouldn't be checking whether a number is a fixed number bignum and passing a bigger numbers than in Ruby's case to the power of 62 it's actually different than passing smaller numbers because the representation is different of dimension of big decimals strings if you pass a string from Ruby to see you actually have to know whether the string is longer than 23 bytes because strings in Ruby represented in two different ways short strings are embedded in a kind of 40 byte Ruby object because there's like out 23 bytes unused they're just like shorter strings are putting those objects longer strings are put on the heap and just there's a pointer to them so if you pass strings between Ruby and see like the macros are written for you but if you're writing your own binary language then you have to know that there are two kinds of strings in Ruby and actually the shorter ones are there used to be two times slower the the now they're the the shorter one okay the shorter ones used to be two times faster now they're 50% faster than then the longer ones are race in other languages in some languages in most languages you have arrays of one single type you have various of strings or ative integers in Ruby you can have arrays of anything including the array you're talking about like the third element of your array can be array itself and the ear we actually have supports for it it drives three dots and of corocut which basically I can't render it because it would be you know like infinite loop of renderings but you can put an array into itself in Ruby and you know how can you convert it into an array of anything and C and in hashes it's even better because the keys can be anything including a hash can be a key in itself so you know how do you pass the hash were some of the keys are strings some of the keys are database objects and some other keys or the hashes itself and the biggest problem is passing structs and objects which can have fields or properties or instance variables however you call them that are very different types so let's do a a second and last experiment let's do a code that you probably all use to some point if you do any kind of web application which is escaping entity so making text HTML safe which is basically replacing the symbols on the left with the entities on the right right so we need some way of taking a string and replacing the lower than sign or character with the unperson elqui semicolon thingy and you might think that okay the pop count was maybe comprised a bit I assure you it wasn't but this is actually I think something to a lot of people can read we can do it very naively as we will see we can have an entity scratch that basically have those transformations and then we have escaped each car drink we freedom patching screens obviously because we can and then we we take an empty string can you eat for every car in our string we push into the result either what's in the entity entities hash and the fetch metal takes another argument which is a fallback so if it's a you know double quotes it will put the entity and if it said the letter A it will just fall back to the second argument which will be in this case dot array but a lot of people don't know this but a lot of people probably should notice the G sub method and on strings in in Ruby actually can take not only a regular expression and some string but also regular expression in the hash and then the things that match the regular expression will be looked up in this cache that we can do something like this pass a regular expression that will match on the keys of the hash and then the hash of the replacement and finally hopefully many more of Ruby developers know that Ruby actually has a built-in escape mechanism for HTML editors entities you require CGI and then you call CGI escape HTML self so but again we can go to rust which is like super fast and also has this thing that is called zero zero cost abstraction so no matter how bad your rest is if your compile will actually make it so much faster because it will know how it should have been written and I'm actually teasing I absolutely love rust and this is not a case of zero cost abstraction Lucy in a second but is it like super super functional and rust let's take the input take the cars iterator and map over it doing pattern matching on every single character and turning into a string either a relevant ID string or turning the the underscore one means the default one the one that didn't match all the previous ones so for example the character a we turn it into a string we get a collection of strings and the collect method combines them back into a single string so this should be if you know the zero cost abstractions worked helped a lot of people say this works exactly has this one which basically lets create a mutating string and let's start with a capacity of twice the input length just because we know it will be at least as long as the input so let's conservatively just add another input length into this and then for every character either pushing to the string mstr which is like literal rough string that gets very nicely compiled into a string that is in your binary and then it's referenced into like memory of your like not actual program memory but into your binary or just push this character if it's none of those entities and then we return the result when we run it it turns out the rust rust push 1 is actually the fastest but the next slower is the CDI the pure will be CGI escape HTML 1 the gzip is significantly slower but the last map is slower still and then the each color on the Ruby side and the reason for this is because that's a very unfair way of saying about zero cost abstractions in Trask this is not a case of rust zero cost abstraction so again don't over interpret things that you hear about other languages just make sure you understand what zero cost attractions abstractions are in JRuby this is actually slower if you warm up JRuby some of this are faster significantly faster even but some of those are actually not faster so again it's not always like this that you warm up JRuby and you suddenly get a faster code and also the CGI the pure Ruby one so like the one at the bottom here this is so fast because again it's written C and it's just compiled when you when you install Ruby so in my crystal we can do this the g7 crystal can't just take a hash and the crystal just knows that if you know there's no regular expression and you want to replace the the keys of the hash or you can actually go into the crystal Center library and call the HTML escape method which obviously should be faster than the discipline right you know where this is going but you can also read a little bit about that method and turns out there's another method that it takes an i/o stream and takes a string and then it escapes the string into that iostream so we can do this you can create an in-memory iostream pass it into HTML escape and then you have to like turn this memory stream into it back into a string like obviously those last two should be exactly the same because obviously the way you would implement HTML escape with just a string in those three lines right well no the HTML one is the last one and it's 70% slower than the i/o one the gzip is actually somewhere between them and like obviously yes and a different person who what is this I created a pull request that we implement HTML escape into into the i1 and just to compare the Ruby this is actually significantly slower than the Ruby implementation so don't also think that you know crystal is so fast so obviously it's HTML escape will be so much faster than my CGI escape HTML no it's actually 2 times slower right it's half as fast so don't don't fall for this kind of easy thoughts that if I were implemented in a different language it has to be slower but it has to be faster because the language is obviously faster always benchmark there is a wonderful article called optimizing string processing in rest where our rest push implementation that is the fastest that we wrote is actually the opening and it says let's take a very naive implementation of this and see what we can do about the speed of it and the outer speeds about 5 spine so you know we can make it run in soap microsecond times and and you know just do it and rest and do some very bad things right a highly recommending I highly recommend reading this the sites or online the lingual will be at the end so you'll be able to click that so why does crystals HTML escape is slower the way it works there is a hash of of the entities and it's actually escapes many more entities and the five that we used and the self escape with a string so HTML escape with a string just cause a G's app with that hash on the string whereas the escape with string and IO through every character and fetches this into a stream and then it doesn't return anything useful because it actually modifies the i/o and the reasons are a little bit more subtle because the like jisub use of string builder which should be as fast as io memory but really isn't and this is where it lies but again always always benchmark and yep also when you're explaining the theory of relativity to physics students you can see although Einstein is the name of creator I mean Einstein's most it's it will be over some promise so the last thing and very briefly that I want to talk about is whether this actually happens like does anybody do this at all actually the JSON gem which is like super popular and you do if you do any web stuff and Ruby implements JSON in three ways there's a JSON pure implementation does in Ruby there's a JSON X implementation that is in C and there's a JSON implementation that is in Java and if you're installing it on C Ruby it will try to compile the C one andre rubies and try to compile the Chaplin and if it can compile or it's neither of those it will just use the pure one totally transparently this actually is being this approach is being used as we speak probably to serve this presentation Levenstein gem which is my absolute favorite gem that didn't get updated for the last five years so it's also perfect by definition it gives you the Levenstein distance between two strings which is how many edits how many deletes inserts and changes you have to do to turn one string into another and again it has this wonderful transparent fallback that if it can use this implementation it's super fast if it cannot it still works it just in Ruby and the other magic thing about this gem it works not only on strings but it actually can take two arrays and tell you how many edits you have to do between them and it can actually take any two objects that implement each and where each gives you an object that implements hash and equal so it can give you an average size distance between whatever your objects are provided they the elements implementation equal and the collections implement each it's really really cool gem very small recommend that you watch it and finally the first blank one which gives you information restoring this blank so empty or only white space characters and the pure obi-wan works pretty fast there is an array implementation in C and then enter ceramic very implementation in Ruby that works a little bit differently and it's in surprisingly many cases like 17 percent slower than the C one if you go into the first blank repository you have a bench point that you can run and see in which cases which one is faster I won't pronounce it out of respect to the people who mispronounce it all the time but well you know we're in France so when I was how your planner here anyway to wrap it up your plans for after this talk and after this wonderful conference please always benchmark try to change your algorithm see whether there are the same implementations of your language and only after try to combine languages together weird ability usually Trump's performance - like it's actually it's always that it always depends but if you say that your algorithm is you know two percent faster and it takes again three screens in comparison to one-liner make sure your whole team is on board with this because it's them who will be maintaining that code and dynamic languages has a method call overhead that you should remember about so maybe if you're trying to do this in a compiled language think whether you don't want to extract this into a micro service with the source both rust and crystal are self hosted which means the compilers are written in rust and crystal so it's not like if you want to contribute to Ruby you have to either no Ruby & Co Ruby and Java if you want to contribute to christou you just need to know crystal if you want to contribute to rust you just need to know rust and again the sources are usually quite readable so do it never create ruby strings longer than 22 characters obviously this is a wonderful article by Pat Shaughnessy and I highly recommend you read it there is a wonderful article by Luca greedy from the hanami project we're about writing rust gems with rust and Alex you can write Ruby extensions with crystal with the huge Cadia that you should really consider just writing crystal and if you if you're a goal person then then you can just write Ruby extensions and oh and this will also work so that's last one actually Frankenstein knows the name of the scientist either person correcting beyond its trivial point of the monster thank you so much the slides are online I'm from rebates we are super happy to be able to help a little bit with Polycom to this year we have kid stickers if you want you can find me try to ask a question and get a sticker or not ask a question I'll get a sticker and thanks so much the sites are up there [Applause] [Music] [Music] you