The Hitchhiker's Guide to All Things Memory in JavaScript

Thanks, everyone. As Jane mentioned, I'm Safia. You can find me as captainsafia everywhere on the Internet (Twitter, GitHub, DeviantArt, probably Neopets too), or you can visit my website, which is safia.rocks. A bit of a content warning to everybody: this is a technical talk, but there is no code or demos. So if you come to conferences to see semicolons and curly braces on a really big screen, I'm so sorry to disappoint. That being said, it'll be fun nonetheless. If you're sitting towards the back, or if you have some sort of visual disability, you can visit j.mp/js-mem-live and follow along live with this talk as I go through the slides, and skip ahead if you'd like to.

So I'm a bit of a computing history buff, and I'm gonna start this talk all the way back in 1936, when a young man by the name of Alan Turing theorized the concept of what's known as a Turing machine. Alan Turing was a brilliant mathematician. He actually helped the Allies win World War II using his amazing math and science skills. The Turing machine was a simple abstract machine. You can see a physical rendering of it here. It essentially processed bits on a strip of paper using a set of rules that were loaded into that big central box. You can think of it as the first ever CPU.

Now, about a decade later, there was another smart guy by the name of John von Neumann, and he theorized what's known as the von Neumann architecture, which basically said that a computer was something that had inputs and outputs, had a central processing unit, and had a separate space for long-term and short-term memory. These were the two early conceptualizations of what a computer was. Turing's idea was remarkable: a machine that can think is remarkable. Von Neumann took it further: a machine that can remember what it thinks is revolutionary. And if the past 80 years of history have been indicative of anything, a machine that can remember what it thinks is indeed revolutionary.

And today I want to talk about the work that
computers do to remember: how they manage memory, and how you can build memory-performant applications in JavaScript and Node.

You're probably wondering, why do we care? It's 2016, my laptop has 16 gigabytes of memory. Is this really a problem in this day and age? The answer is yes, it is, and yes, you do need to care. Although the amount of available memory in machines has increased rapidly over the past 40 years or so, so have people's expectations and demands of what machines can do. People demand web experiences that require a lot of memory-intensive render and compute tasks, and you have to work around that. The other thing is that not all your users are using the latest, sexiest tech, right? Not everyone has access to the latest technology, but that shouldn't inhibit them from having the same web experiences that all your other users do. And the last reason is a matter of pride. You don't want to be that app. You know that app: the one that takes up so much memory you have to close windows, the one that takes up so much memory that it forces all of the other tabs in the same process to shut down. Nobody wants to use that app. Nobody wants to write an app like that. You don't want to subject your users and customers to that kind of pain, and to avoid that, you're gonna have to go to class and learn a bit about memory management.

The first thing that we're gonna do is learn a little bit about what's known as manual memory management. I think I saw a couple of people in the audience start to cry, because they know what I'm about to talk about. If you've programmed in a low-level programming language like C, you're familiar with manual memory management. This basically means that every time you want to use something, say a list of names of wines in a restaurant, you're going to need to manually allocate space on your machine for that data before using it, and when you're done using it, you
have to manually clean up that space yourself. It's an explicit process. You can imagine this is frustrating. It's no fun. We're not made to manage data ourselves and keep track of it. It's painful, and you get code that's filled with errors and memory leaks and all sorts of other nasties.

Thankfully, in the JavaScript world, and in tons of other programming languages, that's not a problem that we have to worry about, because we have something known as automatic memory management, or garbage collection. It's kind of obvious from the name: automatic memory management automates the process of allocating memory and then clearing it up when it's no longer being used. Garbage collectors are generally provided by your programming language implementation. They take care of figuring out when memory is no longer active or being used, and cleaning it up. They've got all sorts of smart ways to do it efficiently.

The garbage collection technique that we're gonna look into specifically is the one provided in the V8 JavaScript engine. There are tons of other JavaScript engines, of course, but we're going to talk about V8 specifically because it's the implementation used by Node. It's also the one used in Chrome, so it's relevant to you front-end and back-end people, bringing us all together.

So memory is allocated into what's known as a heap data structure. You start off with a root object, in this case that red node, and it points to all the other objects that are being referenced. The green nodes are scalars; those are always the terminating nodes in our heap. This is what the memory looks like.

Memory in V8 is allocated into different spaces depending on different kinds of properties. I'm gonna go through these properties really quickly. The first space you have is the new space. This is where memory is first allocated for data, and this space is generally pretty small. Once data has hung around the new space for a while, it graduates into one of two different spaces: if
it's a pointer or a reference to a scalar, it gets moved to the old pointer space, and if it's data in and of itself (a string, a boxed number, or an array), it gets moved to the old data space. There's a fourth kind of space, separate from those, and it's known as the large object space. The name is also pretty obvious: it stores large objects, generally larger than eight megabytes. And then the fifth kind of space is known as the code space. In the code space you have code instructions that are compiled just in time. The main distinction between the code space and the other spaces is that the memory in the code space is executable memory, meaning you have permission to execute the data in that space. Since code is data, you obviously don't want that permission in any of your other spaces, because somebody could put a nasty in their data and do lots of bad things. People use computers for terrible things. I never knew.

So that's how we allocate objects into memory, and those are some of the heuristics that are used to identify what places in memory they should be allocated to. So we've got memory allocated, we're programming, it's super fun. We're changing the world, saving people, building unnecessary Node modules, who knows. But remember, this is garbage collection, so there is a cleanup process involved for the allocated memory. How do we know when something needs to be cleaned up?

The concept that we use to determine when a piece of memory is garbage is known as reachability, and the idea is pretty simple: a piece of memory is considered garbage if there is no way to get to it by traversing from the start of the heap at the root object. So if you're looking at this particular memory, these two lonely pairs would be considered garbage, because there's no way for me to start at the root object in red and traverse to them. They're just lonely and isolated, kind of like me. Dark joke, sorry. No, I'm not lonely and isolated, I have all you guys.

And so we've kind of covered this general idea of garbage collection.
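
The reachability rule can be sketched in a few lines of JavaScript. This is only a toy model under stated assumptions: the heap is represented as plain objects that each carry a `refs` array, and the names `makeObject`, `findReachable`, `refs`, and `heap` are made up for illustration. It is not how V8 actually lays out or walks memory.

```javascript
// Toy heap: every object lists the objects it references.
// (Hypothetical structure for illustration; V8's real heap is not shaped like this.)
function makeObject(name) {
  return { name, refs: [] };
}

// Walk the graph from the root and collect everything we can get to.
function findReachable(root) {
  const reachable = new Set();
  const pending = [root];
  while (pending.length > 0) {
    const current = pending.pop();
    if (reachable.has(current)) continue; // already visited; this also handles cycles
    reachable.add(current);
    for (const child of current.refs) pending.push(child);
  }
  return reachable;
}

const root = makeObject("root");
const a = makeObject("a");
const b = makeObject("b");
const lonely1 = makeObject("lonely1");
const lonely2 = makeObject("lonely2");

root.refs.push(a);
a.refs.push(b);
// These two reference each other, but nothing reachable from the root points at them.
lonely1.refs.push(lonely2);
lonely2.refs.push(lonely1);

const heap = [root, a, b, lonely1, lonely2];
const live = findReachable(root);
const garbage = heap.filter((obj) => !live.has(obj)).map((obj) => obj.name);

console.log(garbage); // [ 'lonely1', 'lonely2' ]
```

Note that the two isolated objects still reference each other. Under the reachability rule that does not keep them alive, because the traversal from the root never arrives at either of them.
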
We've covered some of the heap structure that's used and some details about V8. How exactly does it collect garbage?

Well, V8 employs something called the stop-the-world garbage collection technique, which means that whenever it needs to clear away unused memory, it's going to pause the running program and execute that garbage collection cycle. You might imagine that there are some performance setbacks to this, obviously. This is made better by the fact that most of these garbage collection cycles, where you halt and then run a collection, happen on small chunks of memory, so you're not running a really long garbage collection cycle on a huge piece of memory for a really long time. You're just doing it in bursts. And you can optimize it even further by running a garbage collection during idle compute time, to take advantage of that lull. This process kind of repeats: you allocate memory as you program, and as your memory fills up, it gets cleared away, so on and so forth.

There's a really interesting distinction that V8 makes when it's cleaning up memory. You have data in your new space, which is smaller and generally newer, and V8 takes one approach to clearing this kind of memory. And then you have data in your old data space, which has kind of graduated, it's matured, and V8 has a different approach to clearing away that memory. This idea of using different techniques on different kinds of data is known as a generational garbage collector. So if you want to show off that you're smart, tell people that V8 employs a generational garbage collector. Great pickup lines for everybody.

The first technique that we're going to look at is called scavenging. This is used exclusively in the new space, which is filled with small, fresh data. What happens is V8 goes to the root object of our heap and traverses through it, and every time it hits an object, it makes a copy. Then it clears away what was left. And the effect is pretty simple: if
anything wasn't reached in that traversal, it wasn't copied, and it was deleted when V8 cleared away that existing memory. The general summary of this is that things are copied from a from-space to a to-space. If you're thinking about this algorithm, you're probably thinking this is kind of a really space-intensive process, because you need twice the space to execute it. This ends up not being too big a deal, because the data in your new space is small anyway, so you don't have to worry about it that much. So it's totally fine.

And then there's another technique that's utilized in the old data space, and it's known as mark-and-sweep. It's a two-phase technique. In the first phase, which is the mark phase, you traverse through the heap, marking all of the objects that you reach. In the second phase, you sweep, which means you do a cleanup of everything that wasn't marked. Now, I could go into way more detail about this. The mark-and-sweep technique is one of the oldest garbage collection techniques, and it's been modified a ton through time, but it's like 4:15 p.m.
and everyone here is thinking about a big cold beer, and y'all don't want to hear that. But if you do, come talk to me afterwards, and also feel free to google around yourself.

Now you're probably thinking: okay, I've got this knowledge of the theory of garbage collection and memory management, and I have an understanding of how it works in V8. How do I start to use this practically? What kinds of questions should I ask when I'm coding, to start to think about memory in my application more actively? There are two simple ones you can start off with. The first is: how much memory is my application using? It's a pretty simple one. The second is: how often do garbage collection cycles occur in my application? How often does the trash have to be taken out?

To answer these questions, you need the right tools for what can be a pretty messy job, and one of the best tools is actually right in your browser, if you're using Chrome. I'm a bit of a Chrome fanatic, obviously. It's the Chrome DevTools. I could walk you through a screenshot of what's going on in all this, but instead I'm gonna do a quick demo. I know, I lied and I said there were no demos, but I don't know if this counts as a demo, so let's try it out. Really quickly, I'm gonna pull it up, if I can. I'm gonna exit full screen. Am I in the right place to do that? Oh yes, thanks. Yay, teamwork.

Okay, awesome. So I've got the CascadiaFest website (this is terrible) open on my machine here, and I've got Chrome DevTools open. I'm under the Profiles tab. When you hit this tab, you're going to see what's known as a heap allocation profiler, which takes a snapshot of the heap in a running application, and I'm going to show you one way to use this profiler tool. One really handy feature is that we can look for detached DOM elements. These are elements that are no longer present in the tree, but they're still being referenced in the JavaScript. To do that, we can head over to the class filter here and look for "detached".
Super useful. Let's look at this one entry here. It's the first one, and we can expand it, and it looks like it's a detached anchor tag. There's tons of useful information in the columns to the right of the constructor. You'll see what's known as the shallow size; that's the literal size of the object in memory. You'll see the retained size; that's the size of the object and all of the objects it references. It's known as the retained size because those are all the objects that it's retaining as active, or alive, in memory. You'll see the distance, which is just its distance from the root node. And you'll see here at the bottom that it looks like it's defined as a free variable in a system context, which is a general term for a closure. So somewhere in there we've got a nasty closure in our application. You could use this to debug all sorts of things, like memory leaks and programs that are using too much memory. I'm gonna actually go a bit into those in a second.

Now we jump back into our presentation. This is always the most awkward part of any tech talk, where you're like, "let me do a demo", and you switch context and then switch back.

So that's how you might do some profiling and debugging of memory on the front end. If you're working on the back end, on the Node side, you're obviously going to need to use a different set of tools. There's a ton that you can use. There's memwatch, not "meanwatch". Oh my gosh, that's so embarrassing, there's a typo in my presentation. I could turn it into a joke. That's okay. So it's memwatch (now I'm pronouncing it wrong too). It's platform-independent, it's an npm module, and you can use it for memory profiling. There's heapdump, with which you take a snapshot of the V8 heap; you can actually load that snapshot into the Chrome profiler and analyze it. There's v8-profiler, which contains the Node bindings for the profiler that's used in V8, which is written in C++. And if you're at a larger scale, there are tons of performance monitoring tools. I'm going to avoid promoting or
endorsing anybody up on stage, but if you're curious about what kinds of tools exist, or if you're at a larger application scale, come talk to me afterwards and I can help you.

So you're probably wondering: when should I use all of these things? When should I apply all this knowledge? Is this really necessary? I'm gonna go through some examples of things that you might hear from your users or your customers, and explain some memory-related issues that you might want to look into.

The first is that you get a user who's nice enough to file an issue or a complaint, and they say: "I have your to-do list app open in my browser all the time, and after a while it just makes my browser crash. Why?" This is a textbook memory leak, meaning that there's a logical error in your application that forces more memory to be allocated over time. Chrome comes with a timeline allocation feature, which allows you to explore how memory is allocated over time and when your garbage collection cycles happen, so that you can start to investigate that.

Someone complains that your app has jank and jitters. Nolan explained some reasons why that might be happening yesterday, but if you're thinking about memory-related issues that might be occurring, it might be because your application is running its garbage collection cycles too frequently. This is another time where you can use Chrome's timeline allocation feature, or whatever profiling tool you have, to see how often garbage collection cycles run, in what instances, and which spaces are filling up faster.

And then someone just says, "your app just sucks", and you're just like, "no". I mean, that's a big question. It might just mean that your application is using too much memory in general. All sorts of hairy problems. The one thing that you need to recognize is that memory behaves differently across different implementations, so the way it's cleared and processed in one browser is different from the way it's cleared and
processed in another browser. The steps you need to take when analyzing problems like this are to figure out what browser and platform combinations are most popular for your users, and then start to do a dedicated debugging analysis for each of those browsers. It might take you forever, but the results are worth it.

So I guess there's that. Memory management is a hard problem. Computers have had memory since von Neumann theorized his architecture in the 40s, and they're going to have memory for a long time, and we're going to have to manage it regardless of what sexy framework we're using. Things like V8 make it really easy and include lots of optimizations that hide the internals from us. I'm gonna be a total mom right now and ask everybody to eat their vegetables, appreciate their garbage collector, and understand how it works, because sometimes it's the seemingly unsexy internals of a programming language or a platform that are the most exciting. Unless I'm just really lame, which I hope not.

That was the end of my talk. If you have any questions, you can either come up to me afterwards or hit me up on Twitter. If you're nervous about reaching out on a public forum, my email is safia@safia.rocks. Make sure you have a funny cat GIF in there, just to make it great for me. That's it. Thank you so much, everybody.
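
As a companion to the to-do list complaint described in the talk, here is a minimal sketch of a textbook leak and one way to ask "how much memory is my application using?" on the Node side. It assumes Node; `process.memoryUsage()` is a built-in Node API, while `completedTodos`, `completeTodo`, and `heapUsedMb` are hypothetical names invented for this illustration and do not come from the talk.

```javascript
// Hypothetical to-do list that never lets go of finished items.
const completedTodos = [];

function completeTodo(text) {
  // Logical error: every finished item stays referenced from this
  // module-level array, so it remains reachable and is never reclaimed.
  completedTodos.push({ text, payload: new Array(1000).fill(text) });
}

function heapUsedMb() {
  // process.memoryUsage() is built into Node; heapUsed is reported in bytes.
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

const before = heapUsedMb();
for (let i = 0; i < 2000; i++) {
  completeTodo(`todo #${i}`);
}
const after = heapUsedMb();

console.log(`heap used before: ${before.toFixed(1)} MB`);
console.log(`heap used after:  ${after.toFixed(1)} MB`);
console.log(`items still reachable: ${completedTodos.length}`);
```

The exact numbers will vary from run to run and between Node versions, so none are shown here. The point is the shape: because `completedTodos` is reachable from module scope, everything pushed into it stays alive, and the heap figure keeps growing the longer the app runs. Dropping references to items that are no longer needed is what lets the garbage collector reclaim them.
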