Evolving Complex Systems Incrementally

0 0

Hey, everyone, I had to set everything up for just a moment, and you know mac os and Apple software. So hey I'm Christoph I work at Facebook as front-end engineer in JavaScript infrastructure team. I'm going to tell you how evolved our front end source space. And I'll teach you how to do large scale tool assisted co‑changes to your JavaScript code effectively. And the word that we use for this at Facebook is codemod,code modification,we kind of invented it, that's just what question call it. And at Facebook sometimes we use it inter change establish sometimes it'smanual change, sometimes it's an automatic change, it always refers to large scale changes to a big portion of our code base (Manual) at Facebook we have tens of thousands of judgment modules and we need to make sure that the health of our code base is very good and we have a good handle on that, what's more is that we want to enable every engineer at Facebook to move fast, even with breaking changes. We often add new shiny abstractions, new language features because it's always easy to add something new, we tell people, use that, use Latin ‑‑ not var anymore, we never go back and tell people, hey, we need to clean this up, can you deprecate your code and remove it. It's always really hard to get rid of deprecated code, especially if you have thousands of callers around your code base (Deprecated) browsers also new capabilities, frameworks gain new features as well. We need to somehow handle this. Everything just becomes more complex. The problem is that even though we add all those new features, the health of our code base more and more declines because we have multiple ways of doing the same thing. We also want to encourage people to always just write code using the new patterns, but, especially if you're new at a company or new at a code base and you just started looking at it and you true to figure out how something is done you usually look at how other people around you did the same thing, then you discovered all the code, you tonight know if it's deprecated or not, you use the code around you and you write it like it was the latest and greatest thing, like more and more deprecated code even though you shouldn't be using the patterns anymore. We often have hundreds of thousands of calls of something, we have a web UI that allows you to search quickly across all the code. I had to make a change and I was looking at how many lines of code and how many invalid patterns are we using there that we have to update, and in this case there were almost a hundred thousand results and then it tells you, oh, I cannot display this to you because it would crash your browser. So going through every call set manually is not a solution. It's not even an option for us in some cases. This is a codemod they ran on our code base, we have a tool called codemod, it's very simple, it's an internal tool, I have this string and I want to replace it with this other thing. This has served us really well to evolve our code base over time, it's really good when you do a simple rename, you rename JSCONUS and up load to the web site and you're done. But we're way beyond simple renames and also the syntax for JavaScript becomes more and more complex and also ambiguous, so we can't really use regx anymore so this is something I actually ran at our UI class at Facebook. It has been around for a long time, but we use an early class of ES 1 , ECMI script it was just final iced now called ES2015. Hopefully it's 2015, if I say ES6, I'll buy you a beer later. So I haven't done regX code much in a long time, I figured I can do this with a renaleX the problem I tried to solve was add a new call to create an instance of the URI class instead of calling as a function, we had a pattern for the same path, we had a code to allow tin stance of this class, with new and without new. figured, okay, I can do this with a regX. So I wrote one regX didn't catch all the cases, wrote another, didn't catch all the cases either, wrote a few more, then at some point I was in a mess for some reason everything was broken because one of my regX wasn't very good,I probably had a star, or a dot or something, I was just really frustrated, then I rewrote this using an EST base codemod and it was don't in two minutes and basically two lains of code. I mention ASTs, codemods that operate on the AST, abstract syntax representation of JavaScript instead of source text representation. They can really help us make a difference. If you build a strong codemod for an API change you're making you can rework and apply it every time you make a change to the code that you're trying to build. And what about rebasing? When you manually update your code on a long‑lived branch when working on a new feature, you have to redo changes potentially every time when you pull from master or something like that. With AST codemods I can rerun them before I push my change, done. But, I also want to talk about the frustration around the deaspiration that one often feels when they're thinking of making or braking API change. Good codemods are really hard. And I'm sure, everyone of you, at least once, decided not to run a codemod because it seemed too difficult and they just kept the old API around, didn't make a breaking change, it just wasn't, you didn't have the right tooling for that. However, quite often, there's simply no other way around codemods. We can't just rewriting write our entire code base because we have a couple of bad abstractions. We must find ways to evolve our code base incrementally or well change complex systems incrementally. With a clearly envisioned end state in mind. At Facebook we hack everything and I put up these posters around the product infrastructure team and someone guestening this and made their own version of it. I actually like this better than the original title. But we also need to be confident when we make these large changes to the code base. There's no point in running a codemod if it disrupts every single engineer at your whole company. The whole point of this is not to break anything and tone able every engineer to move faster. And codemod tools help us do this. And the tool that we build at Facebook is called JS code shift it was built by Felix Kling, he built it more than a year ago then he kept it secret for a really long time. He was sitting next to me at the time at Facebook and we casually talked about this new API that I wanted to build and I had this idea of what I wanted to build and I just didn't know how to get there. It was impossible to do. And then he was like, oh, yeah, so I built this tool, it's probably not very good, and I was look, okay, let's take a look at this. I looked at it, I was like holy shit this is amazing, we have to Open Source this and we the. This is really good. Let's dive into how I got started with this kind of codemod work. It's exciting to be able to talk about how we built a relay framework. I know if you heard about prelate it's a data framework on top of fame work. This year at React JSCON we wanted to build a new API, wanted to move away from create graph QL.mix in to relay.create container. So in relay we used to have this mix‑in, but used internal React APIs we couldn't shift to Open Source API, it broke all the time because we act changed all the time as well. And we also couldn't ‑‑ because we were using a mix‑in we weren't sueable to support ES 2016 class components, it was also a way cleaner implementation. We really wanted to move to this. But the problem is, this completely changed the full public API relay and admittedly it's not a big public API, but it changed the entire API. At that point in time, herbally this year, there were already hundreds of files using ‑‑ written using relay, and the product teams using product ‑‑ was moving really fast. Building this enough API took about a month to complete. I built a codemod and had to run it at least ten to fifteen times to test different stages of the new API. If I had done this manually, while we were building this new API for a month, I would have had to change those hundreds of files manually every time when some product developer changed something and it would be a huge mess. It's painful, just to get a better idea of here's the before and after. It doesn't matter if you know relay or React, t matters that you have pattern matching capabilities in your brain and just compare those those code pieces side by side the left side is before the right side after. If you ever removed a mix in and made a higher order component yourself you know what the challenges are involved here. But in this case, the original example, I guess this laser pointer doesn't work that well here, we had a mix‑in up there and statics the new version didn't have the mix‑in anymore, and then we had on‑line fifteen the relay.create container that was thereabout before, and then also we moved the stat ticks into the secondary argument, to the function call. And traditionally we would solve this with regx, right. A lot you this doesn't seem that difficult, I can probably do this with a renaleX, I could maybe do it, I wouldn't be confident and I'm sure it would break badly. And here's why, I could probably bulled a reg X from the example before, look at the variation of a similar module on the right, it's just a different developer built a similar module with a different code convention. This had a second mix‑in so we can't remove the whole mix‑in line, had different stat ticks and different in how it was exported. But also for this change I had to add a new module that didn't exist before and remove an old one from the require Blog. We have human code conventions as everyone does, I had to enter the new requirement statement at the right alphabetical locations, this is just a lot of work with regx. You know. This is what I wanted to show you what can be different from one module to another module and quite often that night be two people working on the same team, working on the same product, you know, and the whole point of this is to output some code that looks like a human had written this. So, I need to figure out how can I write a codemod that looks like I wrote this code. But there's more, what does this piece of JavaScript mean? Does anyone know? Anyone seen this before? Everyone's quite. So we don't actually know because it depends on what context this is used in. So this is an entirely different piece of JavaScript. Anyone, anyone has an idea? Everyone's shaking their head. So the first win, this is actually a block statement with a label called JavaScript and the value that it maps to is a binary expression is awesome and concatenates that together. But the second one, this is an object expression where the key is JavaScript, it only has one property and the it's awesome I've talked about why ‑‑ what is JS code shift? What does it do for us, simple description is an ST toST codemod, we need to dig deeper and figure out what tools are involved and what concepts do we need to understand. So let's ask ourselves what do we actually want to do here? We want to take one piece of JavaScript and output a different piece of JavaScript, and we're going to operate on the ST, so we need to parse JavaScript source code into an ST then we need to find the nodes that we want to replace, create new nodes that we want to insert, then update the EST at the right location and then print it back into JavaScript source ideally with proper formatting and it should look like a human wrote this. I want to point out that all we do here is simple tree operations. As Front‑End developers you should all know what this is doing, right? We're dealing with the DOM all the time we find, create and update nodes it's same for ASTs, let's look at these step‑by‑step. First up parse, I talk about the abstract syntax tree. I'm pretty sure most of you are look nope, sounds like stat tick analysis, it doesn't matter if you know ASTs or don't, I'm going to show you what an AST looks like and you can time me if you want to. There's a tool that we built at Facebook called AST explorer. I'm go going to ‑‑ (Explorer) cool, I'm at the demo. This is not very big, this is better. All right, cool. So this the the AST explorer, you can follow along if you have your laptop open. All it does is it gives you two panels, there's the code you can write, and then the AST representation of the same code. AST tonight be scared, it's just a simple JSON data structure, this tool helps visualize, in this sample code we have two statements and one of them is a variable declaration and one of them is a function declaration, it's not very exciting, it's just ‑‑ it's initialized using an array expression. It's called tips and that's an identifier. Let's look at some other code. So, I showed you this example before, right. And you were shaking your head and you're like what are you talking about, this doesn't make sense, this looks like an October. No, actually this is a block statement. It has a label here, the label is called JavaScript, and the body of the block statement is binary expression. So, now, let's wrap it in parens and now suddenly this is an object expression, it has one property and then this property has a key that's an identifier called name and then a value with a binary expression. That is awesome expression. All right, cool. Remember what an object expression looks like, I'll go over this in a moment, this is important. We also can do function calls, a little bit of German here. In this case, this is also important, there's a call expression, thecal Lee is called hallow, and the arguments is a string literal called JSConfEU. All right. Let's do something quick, we can create a class JSConfEU has awesome food. Does it have awesome food? Yes, hell yeah! This probably extends probably JSCONF, we can see this is a class declaration, it's ID, it's name is JSConfEU, it has a super class it extends and then here it has a class body and it has a method called has awesome food, and of course, this somewhere returns true. So that's an AST. So next I want to talk about create. We're going to skip over find, I hope you allow me to do this. I said before we're dealing with a simple objected day the structure, we could build up an object and ‑‑ A ST, you can do that but you tonight have type guarantees and probably end up in a mess because you make a Typo nothing works anymore, but there's a project created by Ben jaw listic anyone who used to work at Facebook, it has a the similar syntax to Babels creation system. This is an API to define AST types and then it gives us a functional API to compose those AST nodes, gives you type safety, tells you when you cannot create an AST node with what the type you in the mind. You saw the object expression before, here on the left side we define an object expression, basad on an expression built up using properties and the field properties is an array of property and you can property right below that, property based on a node, built up the kind the key and the value. The kind is iter init, or setter, usually init, we're scared of getter and setters, and then there's ‑‑ getters and setters, a key (Getter) a field value that can be any expression, the value can be any expression it can also be an objects expression because objects expression is based on an expression. On the right‑hand side you can see how we build this up, basically we get a function called objects expression we pass in the property, and if we were to print that code, it would print JavaScript is awesome, over course depending ton context it would add parens, otherwise it would be a label statement and we don't know what to do. There's more in JS code shift you can type it out in a template literal and then reparse that JS code and then you can actually ‑‑ using substitution to output proper AST nodes. And then, this is from an example that converts four loops into a ‑‑ loop. This is really cool. Going to skip over update, hope you allow me to to that use. We talked about parse and create, let's talk about printing, this is really hard. The project we're using for in is recast. It was created by Ben newman also. it's really awesome, gives us one API, it's called recast drought print. What's the point of 24? ‑‑ what's the point of this ‑‑ it ‑‑ if you build a JS Compiler like babble you don't care about the out put of your transformed code (Don't really care) the only thing that see it is output is the browser, and we usually minute FYcode, if you write a codemod you don't want to find yourself in a situation where the code you printed that you didn't even touch looks completely different from before. If I do this, if I run a codemod and I only change one tiny little thing here and output looks component model plately different and I run it on tens of thousands of JavaScript modules every engineering company is going to be annoyed by me. I don't want any engineer to know I'm even doing this. (Don't) you can provide options on what the code style is, that you ear using in your code, like indentation, what's even better, it doesn't try to reprint the code you didn't touch that's important, you're only changing the things that you're actually changing. And this is why we cannot just use Babel which is a Compiler to to this not yet any stay tuned. We talked about creating ST nodes and hao we print the ST back into source code. For find and update this is were JS code ties it together, provides a neat little package for you to do these AST codemods, in the past we simply didn't have tool set enable us to do this. We didn't have a standard for JSAST a couple of years ago, we needed all these tools before that. So let's do a demo, everyone loves demos, we'll write a transform that get rids of a hypothetical module called merge, this is a module that exist in the three different variations at Facebook and then I killed them all at Facebook, they're not there anymore, you can search the source code, it's not there, I promise. Took all the arguments you pass into it create add new object and created all the passing properties into that new object create add new object. Then we created a new syntax for React in the beginning we realized this is also awesome for objects we can use the spread property, the three dots right there that you see in the second lain to spread all the properties of an object into a new object, this is really powerful syntax. This is not a standard yet, but we hope it will be in ES 2016 or 2017. We're transforming what you see in the first line to what you see in the second line, ironically because it's not a standard we're using Babel to transform it into third lain for the browser. So this is actually really funny. But given those merge calls that could be nested. And that could be nested merge calls and really deep object structures you could have like whole function in the merge calls like we had a module that pulled in another module and then merged new functions on to that module. This is impossible to deal with in reg "X,"just cannot be done. And how do you form at the output, right? If we tonight have recast we don't know how to format what we're doing here. So, I ran this code it was no big deal, I got rid about a thousand caller of this function and four thousand callers of the other two modules. No one noticed I had to tell people afterwards. But it's gone and now everyone's using the same thing and hopefully will become a JavaScript standard. We have a call expression here. It has three properties (Gone) it has an identifier, I'm just going to show you what the output is going to look like. We need to parens as I showed you before. It's going to be something like A, object expressions be inlined. Two of them are spread properties and one of them is a spread property. All right. So we built this AST explorer and we're look can we put JS code into this and we did. Let me just so we got this live coding environment here, it's really cool. So the JS code API is basically you create a file and it export asfunction that you can then apply using a runner that runs in concurrently across your entire code base. It gives you the file, the source of the file and the API of the code shift here on‑line five I'm mapping it to J, J does everything for us, writing it quite often that's why it's a short one character identifier, we just pass the thing into J and gives us this cool object, it's magical you should read the documentation (It's magical) this is the standard transform, we find all the identifiers and we update them. What we do is we just ‑‑ reverse the identifier. This is not a useful transform unless you wrote the code originally right to left. Any way cool thing is we can live code here. If you're feeling German you cupper case all of your code and it will still work, which is awesome. These not very useful for us. So we realized up there we need to operate on call expressions, so this is a call expression. And then it doesn't work. It's just the name is actually call.name, ‑‑ calllee.name, now there's only merge, we don't want that. So we want to replace it with an object expression, that's what we saw before, and that we noticed that takes an array. All right, that's cool, it doesn't really do a lot, but we replaced all the personal calls with object expression. The problem is, this is not merge, and it's also replaced. We don't want that. So this gives us the ability to actually do pattern matching here. All right, so now we're only actually updating all the merge calls, and then here, what we can do is, we map over all the arguments so we have DST note here. Then we have the arguments and then we return a spread property with the argument. And boom, we're done. Let me just format this, whatever. Okay, but this is not what we wanted, all right, we wanted to align the object, let's look for the type here, object expression, and then return all the properties of this argument. Instead, all right, this doesn't work, we actually need to use this really awesome short version of ES2015 code with recursive flat implementation. And boom, now you can see that we successfully replaced that personal module. So this is actually just five lines of code. All right ( applause) and to me 24 was like learning a super power. When you learn how how to do this you can write Babel plug‑ins, ‑‑ so one more thing, JavaScript now has a class creation mechanism that we're supporting in React, I figured we can transform React.create class with a codemod, superintendent that fun? I built a codemod can automatically do this, if you look at the example, inlines it into the new con instructor, it removes this access and props because it's now local variable, it does a lot of stuff. It also adds the prop types later on. But, I talked a little bit about this, we the a demo, just like a tiny little sample code, that's not really fun, you know. I found this project on get hundred that's did he recall UI for React. Now let's run this thing. Takes a little while, can I wants over all the mix‑ins because we don't have a good solution for that, but you can build your own codemods to get rid of them if you use them a lot. Takes a little while. Tested it yesterday, I'm 200 percent sure it works really well. Now we're done it used all the cores of my MacBook and transferred all the files, yeah, looks about right. Nope ... can we send a pull request. Do I have internet? I should probably push it. All right. Where is it had are? Button is not working, all right, here it is. All right (Applause) there it is. This was fun (Laughing) that was 845 files and probably 5‑10,000 lines of code. If anyone asks I didn't touch the code, I just wrote the JavaScript that wrote the JavaScript, so my favorite editor is JavaScript. But sometimes you cannot fix something automatically, not possible, we tonight have all the tools yet. What I found at Facebook is that you can bribe really well with t‑shirts. So at Facebook we have been using an early draft of ES2015 classes for two and half years. It allowed every edge near at the company to move way faster that they could have using ES 5 features. You have to consider that a lot of engineers at Facebook didn't grow up with JavaScript in their heart like all of us here did. You know. The problem with our implementation of classes was that we didn't require soup erto be called in a child class something that's required in ES2015. We had about a thousand invalid callers, how did I detect them I wrote a codemod script to detect the code. We were able to fix about600 of them automatically because they the president have interesting parent con instructors, four hundred we had to fix manually. We do hack tons everyone at Facebook, promised everyone a t‑shirts, we had 400 call sites per person. And we did it. Everyone got a T‑shirt, we high fived each other, it was great. But the whole point of this is that I want to point out that codemods are not ‑‑ they're not going to solve all your problems but they're really useful and assist you when you do manual clean ups. what's the impact on Open Source. It's important how we approach Open Source at Facebook. We only Open Source what we use in production and add scale. The React team they're responsible for deploying new versions of React inturnly on ever respousetory, they're responsible for making sure that we move to new APIs as fast as possible. And they're making sure that we move fast even with breaking changes. But we always use master. If master is broken Facebook is broken. So this should explain why we have invest in the this kind of code tooling, we have a lot of React code . But the tooling is influenced how we approach Open Source code, if you have looked at recent release notes of React native or React, you see that we actually published codes mods for React and React DOM political up in 0.14 you can run a script if you have a large code base and it will be fixed auto3459ically for you. 24 is great. When we make breaking changes and we can write you a script and up grayeds your code automatically when you update your version of React, the whole community wins. And people have been really happy with this. This person found this tool and in less than two hours he safely transformed thousands of lines of code, this is outside of opinion, it's really cool. What I found about this is that I recently rap a codemod on every test file at Facebook. And a day later after I made the changes I wanted to figure out how many lines of code did I change. It was 40,000 lines of code, I didn't notice that when I was doing the change, no problems, all the tests are still running, everything's great. One more tiny little thing, I'm sorry we're a little bit over. Then Abramov was asking for an on‑line babble editor, I was like didn't I show you the same thing with JS code shift. Do you want to see that really quick? All right, this only takes ten seconds, I promise. So here we just go into Babel. And then what we're going to do is Translate our Babel script to German, because that's fun. That's it. Thank you so much for listening. (Applause) wow, that was awesome. Thank you so much. So we're just going to do a very quick scene change here and bring up Thomas who's got probably the better GitHub user names, it's Watson and he's ‑‑ come on up, get set up, man. In the time honored tradition of writing lots and lots NPM modules you can see Thomas has written such hits as bay 64 and coding things as Emoji. Enabling the T‑Cup protocol. On HTTP servers. While you guys get settled, we'll just get set up here.