Clojure At Scale


Right, so my name is Anthony Marcar. I work at Walmart Labs as a Clojure developer, and I'm going to talk a bit about Clojure at scale.

When I talk about Clojure at scale: we've been using Clojure at Walmart for about three years now. Like a lot of people in the Clojure community, when you're operating at a scale a little larger than normal, it's a bit of a frontier. There are a lot of tutorials out there on how to create small Clojure applications, little libraries here and there, but resources on how to create larger applications are something we didn't find too much of. So this talk is basically about the experiences we've had and some of the things we've found that somewhat work. I don't want this to be "this is how you should do it". These are just our experiences and the recommendations we've come up with. Everyone may be doing these things differently; it's a frontier. But anyway, this is what we've come up with.

A little bit about our system. We run a team called eReceipts within Walmart. To the customer at a store, it's a few things. Every receipt that's printed in the store has a QR code down the bottom. That QR code encodes the unique identity of the receipt, so you can scan it with the Walmart app, or any app for that matter, and it will download that receipt onto your phone and associate it with your account. You can now do returns, and you'll never lose that receipt again. In the future we hope to go purely paperless with this as well. As well as that, there are other teams within Walmart who use this information to provide other benefits to the customer. Savings Catcher is basically getting money back if an item was cheaper at other stores. InstaWatch is getting a digital copy of a movie that you purchased in a store. And the 1-Hour Guarantee is the largest sale in the world. All of these partners inside the company use our data as well.
Within the data center is where it starts to get more interesting. We have approaching 5,000 Walmart stores, I think, and we get data from all of those stores. Every time a receipt is printed, that transaction is sent up to our data centers, where a whole bunch of stuff happens, a whole bunch of side effects. These could be saving it to the database, forwarding it on to other partners within Walmart, or sending push notifications to customers. It's essentially just a one-way pipe of data coming in and stuff happening.

Of course, in the internals we use Clojure. It's essentially a one-way directed graph of data. We have durable queues in between all of our consumers and producers, because when you're operating in a Walmart data center, stuff goes up and down and goes missing. Everything that Zack was just talking about is very true, especially at this kind of scale. So at all the edges we have queues. If Apple is down, which I hope it wouldn't be, then we have a queue that protects us from that, and vice versa for all these other side effects.

We have a lot of other Clojure as well. We have Clojure to do REST APIs for access, we have Clojure to do deployments, we have Clojure to do monitoring. It's Clojure all the way down. All in all this is about 20 services that we run, about 50 thousand lines of code, and about 70 Leiningen projects. These are all ways to get a sense of roughly how big it is. I don't think it's the biggest Clojure system out there, but it's a little bit bigger than normal. We run this with about eight developers and have been doing it for about three years. It all started from a Java application back in a startup, and then we've rewritten it all in Clojure.

So the stuff I want to talk about today is all the things that we ran into as we grew the team and grew our code base. How do you structure code? How do you deal with multiple projects,
like Leiningen projects? How do you evaluate what libraries to use inside your applications, with a bit of a framework for thinking about that? And also just embracing the Java ecosystem. We build on the shoulders of giants. Java is painful to write, but there are a lot of awesome technologies there. It's easy, when you come in from a non-Java background, to be scared of all that Java stuff, but when you reach these kinds of scales Java becomes your friend. And if I get time (every time I give this talk it goes a different length), I'll talk about infrastructure as code as well, and some of our ideas around that.

So, structuring code. As I mentioned, there are a lot of tutorials online, little quick namespaces showing you how to write applications. Almost all of them have a def connection, a start-connection, and then functions that reference that global state. Over time we've tried to eliminate as much global state as possible. A lot of this is driven by Stuart Sierra's talks on Clojure in the Large and on component, and the recommendation I can give that has worked really, really well for us is: embrace component. Component is a library. It's a tiny framework that helps you manage runtime state and gives you a place to put it, so that you can instantiate it, you know where it is, and it's not just some global var sitting off in the distance.

I'm going to show a little bit of code, examples of how we actually use component. Before I do that, who is already using component in some way? OK, so I'll walk through a little bit of how component works and what it is.

This is a component. This is a Cassandra component that we have. It essentially holds a database connection and the configuration on how to create that connection. It's just a record in this case; a component can actually just be a map. It's basically just some place to put data, and so on.
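The slide shown at this point is not part of the transcript. As a minimal sketch of what a component of this shape might look like, assuming Stuart Sierra's component library is on the classpath: the names `Cassandra`, `new-cassandra`, `start-connection` and `stop-connection` are illustrative, and the two connection helpers are hypothetical stand-ins for a real driver so the sketch can run without a cluster.

```clojure
(require '[com.stuartsierra.component :as component])

;; Hypothetical stand-ins for a real Cassandra driver.
(defn start-connection [hosts keyspace]
  {:hosts hosts :keyspace keyspace :open? true})

(defn stop-connection [connection]
  (assoc connection :open? false))

(defrecord Cassandra [hosts keyspace connection]
  component/Lifecycle
  (start [this]
    (if connection
      this                                   ; already started, so start is idempotent
      (assoc this :connection (start-connection hosts keyspace))))
  (stop [this]
    (if connection
      (do (stop-connection connection)
          (assoc this :connection nil))      ; mark the record as stopped
      this)))                                ; already stopped

;; Constructor: pulls only the settings this component cares about
;; out of the application-wide config map.
(defn new-cassandra [config keyspace]
  (map->Cassandra {:hosts    (get-in config [:cassandra :hosts])
                   :keyspace keyspace}))

;; Usage: start returns a new record; the original is left untouched.
(def stopped (new-cassandra {:cassandra {:hosts ["10.0.0.1"]}} "receipts"))
(def started (component/start stopped))
```

Because records are immutable, `stopped` still has a nil `:connection` after `started` is created, which is the property the walkthrough relies on.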
We have hosts, which is just going to be a vector of IP addresses where the Cassandra cluster is. We have a keyspace, which tells us which keyspace we want to access inside Cassandra. And when we first instantiate this record we don't give it a connection. The connection is something that we want to be put into this record after it's started.

So how does that happen? There is a protocol called Lifecycle in component, and it provides two methods: start and stop. You call start, and since we're in Clojure and it's immutable, this is going to return a new version of this record. The old version still exists. After we call start, it'll go and do all the work to create a Cassandra connection, and it'll return that by just assoc-ing the connection back onto this record. That's what you get. This isn't some kind of mutable thing where you call start on it, it mutates the state inside, and you keep the same reference. Every time you call start you need to grab the result of that, and component gives you ways to do that very easily.

Within here we say: when we call start, if the connection is on this record, that means it's already started. We want idempotent operations here, so we just return immediately, sending back the same record we started with. If it hasn't been started, if that connection object is nil, then we call this start-connection thing. That could be whatever; in the case of Cassandra it probably just returns a connection object. So we assoc that object back onto the record and return the record.

For stopping: if the connection is present, it means it has been started, so we do want to stop it. We call stop-connection, which stops the connection, and then assoc the connection back on as nil so that we know this is no longer started, it's stopped. And vice versa: if the connection
was nil, that means it's already stopped, so we can just return the record as is.

For each of these components we usually have a helpful constructor. In our applications there is a config map. It's basically an EDN file that sits on the machine and is loaded when the application starts, and this config contains all the configuration: IP addresses, usernames, passwords, all that kind of thing. We throw that into the constructor with a keyspace, and the constructor is responsible for pulling out the information it cares about from that configuration map. In this case we assume that config is a map, it has a key :cassandra, and under that it has the hosts, or whatever it is. We use that to call the record constructor, passing in the information that we need.

So this is what a component looks like for us. There are different types of components. If anyone has seen Stuart Sierra's component talk, he talks about three different types: stateful, service, and I think it's composite or something like that. This is an example of a stateful component. There is some state. State could be a database connection, a thread pool, an atom, anything that is inherently not immutable. Even a core.async channel is state in some way. So this is encapsulating that state somewhere that we can think about, reason about, and instantiate.

Another type of component is push. This is a service component. It encapsulates some service over here which does something. In this case it's an HTTP SMS service that we might use to send an SMS to a customer. By the way, most of this code is really close to how it looks in our production code base. In this case you'll notice that we have a record; this is our component. It has a little bit of configuration, a host and a port that we want to send information to, and we have naively decided to use a blocking HTTP client,
which doesn't give us any access to that connection or any way to stop and start it. But that's fine. In this case there is no lifecycle, so we just say a record is constructed and it doesn't need to be started or stopped. It's just a record. What component will do is say: if there is no lifecycle defined for this particular record, then we just return the record. There's nothing to stop and start, and that's fine. So in this case we just have the record, and it implements a protocol which sends an SMS to a customer, given a phone number and a bit of text to send them. Again we have a constructor function to make our lives a little easier. We pass in a configuration map, it pulls out the SMS configuration from that map, and uses that to construct the record. So that's a service component.

The last component I'll mention is a composite component. You can think of it as a place for business logic. We've got our underlying components that deal with state and service providers, all these raw operations that we want to do. A composite component is somewhere you can actually put the business logic. This one again is a record with no start and stop, because there's no state we care about. We give it a little bit of configuration, some text that we want to send in the SMS, and then we give it some components of its own to use: in this case the Cassandra component, because we need to persist receipts, and SMS, because we need to send messages to customers.

Here's our constructor. This one's a little bit different. We've got our usual call to the constructor for the actual record up here. Inside the uploader section of that config map there's going to be that SMS text, we presume. But you'll also notice that we're using the component system to say: you're
going to need a Cassandra component and you're going to need an SMS component. We're not going to tell you what they are right now. You don't have to construct them before you pass them into this uploader constructor. We just assume that there is some component that has been created which you can reference by this name here, receipt Cassandra, and there is also going to be some component called SMS. When component finds those and tells you what they are (and this is just classic dependency injection), it slots them in to Cassandra and SMS up here in the record.

So finally we end up being able to use these components. In this case we have an upload-receipt function. When a receipt is uploaded into our system we need to go and do some stuff, and that stuff is inserting the receipt into Cassandra and sending an SMS to the customer. The important thing here is that we've passed the uploader component into this function.

Usually what you'd see is something a bit like this old upload-receipt, because this is actually what we used to do. We would call this upload-receipt function, pass it the receipt, and then we would just assume... I shouldn't go to that meeting. I thought I'd turned off Outlook. I did. Yeah, we use Outlook at Walmart.

So yeah, you're going to insert a receipt into Cassandra and you're going to send an SMS, but you'll notice that we haven't told it what that Cassandra object is or where the connection is. This is very typical code that you'll see around the Clojure community, where it is assumed that there is a connection object in some namespace that has been started, but we don't know. If we were to just call this function as is, who knows what might have happened. That connection object might have been stopped. There might be some kind of race condition, because now it's a single bit of state that everyone's trying to use. So there's a lot of
problems with this, and while the code looks nicer, with less destructuring of various sub-components and so on, it does hide a lot of complexity.

So this is how we write it. We pass in the uploader, and we pull out the information we care about from that uploader, which in this case is the SMS text configuration, the Cassandra component, and the SMS component. The Cassandra function now just takes a Cassandra component and does something with it, and vice versa with send-SMS. This function doesn't need to know about the internals of Cassandra or SMS; it just passes them through to the bit that cares. The important thing is that we have access to all of these and there is zero global state. Everything is within lexical scope within this function. We find that a huge win.

Let's go one step further. How do we construct this system? Here is the constructor for our whole system. Ours is about ten times bigger than this, because we have a lot of components. Essentially what we're doing here is saying: construct a system, assign these names to the various components so that we can reference them in other places, and then go and construct them. We construct the Cassandra component, passing in the configuration map that we're pulling in here, and we say this particular Cassandra connection is for the receipt database. We also construct another Cassandra component, this one for the customer database. You can see we've already gone further than what you could do using a single global var for the Cassandra connection. You can't instantiate that connection multiple times. Here you can, because it's just in this record, so you just create multiple versions of the record. For us, we have multiple databases, and this is something which, I think, as code bases
get bigger and bigger, you run into. This is where the advantage comes in.

Another thing to note here is that we don't need to care about ordering. Component will figure out the order of dependencies when they need to be started and stopped. We could swap all these around, we could put Redis down here; there's nothing we need to worry about. Here's where we actually tell component how things are ordered. We say this customer API component depends on the customer Cassandra and the store Redis. Component will therefore do the job of constructing those first and calling start on them first, so that everything is started and stopped in the right order.

So this is our constructor for the system. We've got services here. The receipt service might bind to port 8080; that would be its start operation. Push would usually just be starting connections and doing various configuration. As I mentioned, ours is a lot bigger than this, but this is how you can start a system, and I can show you what that actually looks like.

I've got a bit of configuration here. This is pretty much what a lot of our stuff looks like. You have some SMS configuration including the host and port, there's the SMS text we referred to earlier, there's a receipt entry, and there are some Cassandra hosts that we care about. This is what our configuration map looks like when we pass it into our application. Then if we want to create a system, we can call this new-system function, which I mentioned earlier. So let's create that and see what it creates. Nothing. Oh yeah, I need to load some configuration first. OK, so here you can see we've created a receipt Cassandra record. It has hosts, the keyspace, and a nil connection, and that's because we haven't started the system yet. We have a customer Cassandra, again with a nil connection. We have our uploader here, which
importantly doesn't have Cassandra or SMS yet, because we haven't started the system. So this is what the system looks like when you construct it, and remember, it's just a map. There's nothing special going on here yet.

So let's take that, let's call it system, and now let's start it. I'm going to pretty-print this. OK, this time we can see that receipt Cassandra has a connection object, it's here. We can see that the customer Cassandra has a different Cassandra object. The uploader now has the text, and it has a reference to the Cassandra object. Remember, we're talking about a big persistent map here, so while it looks like it's all copied around, this is actually just referencing the same bit of data. This connection object is the one which was created earlier. So now we've filled in all the blanks. Everything's been started in the right order. Our connection objects are created, and all of our business logic components that relied on Cassandra and everything else are all filled in now. This system is ready to go and we can start using it.

And just to hammer this home, the system that we created originally is still the same thing. The original system that we created hasn't been mutated. It's still an immutable record. This makes reasoning about your system very easy.

So this is a great way to structure your applications, we've found. It really helps in testing as well. We do a lot of side-effect testing. We will say: run upload-receipt and make sure that an SMS was sent to a customer. Well, we really shouldn't send an SMS to a customer when we're running a test. That's not a good idea. This is a pretty common situation that people run into, and the traditional way you do this is you would mock. You would use with-redefs: you'd take the function that you were going to run and say that instead of running it,
just do nothing. Mocking libraries can get pretty powerful, but if you can restrict yourself to the core Clojure ecosystem, the core Clojure libraries, you're going to find yourself in a much more flexible place.

So what we do is this. We had our push protocol here, SMS, and obviously here we send a POST to some SMS provider, and they do the work of sending the SMS to the customer. Instead of creating that record, we can create our own record that still implements SMS but simply puts the result onto a channel. We've experimented with this a little bit. You can put it into an atom, or you can use various other synchronization mechanisms. The great thing about putting this information onto a channel is that you can block somewhere lower down in the test, so you know immediately when a particular operation occurred. Compare that with performing a side effect and just putting the information into an atom or something like that: you've now got to poll at the end of the test, or do a thread sleep for about 10 milliseconds while you wait for that side effect to occur. Because by definition, with everything we do here, there are queues involved, there are side effects, and everything is occurring concurrently.

So this is how we do it. We say: when the send-SMS operation occurs, just put that information on the channel and you're done. Here we have the same kind of constructor for this particular component.

When we actually create the test, this is the full test. We have some SMS text that we want to send the customer. We create our configuration map. We create a channel onto which the SMS information will eventually be put as part of the side effect. We construct the SMS component using that SMS channel; again, this would have been HTTP if we'd been using our usual system that I showed you earlier. And now we're going to construct the uploader.
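The test on the slide is not captured in the transcript. The following is only a sketch of the technique being described, assuming core.async is on the classpath. Everything named here (`Sms`, `ChannelSms`, `Uploader`, `insert-receipt`, `take-or-timeout`, the sample phone number and text) is hypothetical, and the Cassandra side is replaced by an atom so the example runs on its own. The timeout helper anticipates the point made a little later about not blocking forever.

```clojure
(require '[clojure.core.async :as async])

(defprotocol Sms
  (send-sms [this phone-number text]))

;; Test double: records each SMS on a channel instead of calling a provider.
(defrecord ChannelSms [ch]
  Sms
  (send-sms [_ phone-number text]
    (async/put! ch {:phone-number phone-number :text text})))

;; Hypothetical stand-in for the Cassandra insert: collects receipts in an atom.
(defn insert-receipt [cassandra receipt]
  (swap! (:store cassandra) conj receipt))

(defrecord Uploader [sms-text cassandra sms])

;; Everything the function needs arrives through the uploader argument.
(defn upload-receipt [{:keys [sms-text cassandra sms]} phone-number receipt]
  (insert-receipt cassandra receipt)
  (send-sms sms phone-number sms-text))

(defn take-or-timeout
  "Blocks for a value on ch; returns ::timeout if nothing arrives within ms."
  [ch ms]
  (let [[v port] (async/alts!! [ch (async/timeout ms)])]
    (if (= port ch) v ::timeout)))

(defn sms-sent-on-upload? []
  (let [sms-ch   (async/chan 1)
        ;; Components are just maps, so dependencies can be assoc'ed on directly.
        uploader (-> (map->Uploader {:sms-text "Thanks for shopping!"})
                     (assoc :cassandra {:store (atom [])})
                     (assoc :sms (->ChannelSms sms-ch)))]
    (upload-receipt uploader "555-0100" {:total 12.34})
    (= {:phone-number "555-0100" :text "Thanks for shopping!"}
       (take-or-timeout sms-ch 1000))))
```

In this sketch the side effect happens synchronously, so the blocking take returns at once; the same test shape still works when the SMS is sent from another thread or behind a queue, which is the case the talk is concerned with.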
We've seen this one before: it's going to take the SMS text and so on, and expect that there's a Cassandra and an SMS component handed to it. Because components are just maps, we can do that. We construct this uploader, don't start it yet, and just assoc the SMS onto that particular uploader. We're telling it: just use this one.

Continuing on, we create a phone number, we create a receipt, which is just a map for now, and we call upload-receipt. What we want to test here is that the SMS is sent. So now we can just sit here blocking, waiting for that particular channel event to occur, get the value out, and then, boom, we can say: here is the phone number and text, is it correct? One little challenge is that if that side effect doesn't occur, this is obviously going to block forever. So you usually want to put a little bit of sugar around this to time out after a second, and if the timeout occurs, the test has failed. That's all stuff you can add pretty easily.

I want to stress that this is the whole test. There is still no global state. Everything is here. So if you wanted to run these tests in parallel, you can do that now, because you're constructing completely separate systems. If you're doing things with ports, you can just have an incrementer so that every time a system is constructed it uses a different port. Now you can blast off ten of these tests in parallel. You could very easily use this for property-based testing, because this looks an awful lot like a property's for-all. All of this becomes very easy. You don't have to worry about state stepping on each other, or different threads stepping on each other. Everything's here and you can instantiate it, which we find pretty cool.

The final thing I'll say about this: I showed you the big system constructor before, with
all the different services. Again, we have something like 50 components, and a lot of them are side-effecting. When you're running tests you don't want to perform those side effects. So what we end up doing a lot of is creating a mock system which has just the components that we don't want anything to happen to; in this case SMS, and Apple and Google push. The next time we run the test, because a system is just a map (component is really nice like this; it's 200-something lines of code, most of which is comments; it's really small, just some dust around the edges of maps and records), we construct a system, which returns a map record, and we merge in the mock components over the top of it. We're saying: for those particular components, just overwrite them, we don't care about them. And now we can call start.

This is what a lot of our code looks like. We have a full system, all the production code that we want to test, with just the very edges mocked out. We can run upload-receipt, passing in that entire system, and test something like 99% of our code all the way through. This is what a lot of our tests look like.

So I would highly recommend using component if you're not already and you expect your code base to scale in any way. Even if you don't expect your code to scale, component is just a really nice way to reason about your code. It also answers a few questions about where you put code, which is the other thing we struggled with. A namespace is not a class. A namespace is just some place to put stuff. One pattern we've been using a lot is: you've got a component for Cassandra, it has insert and all the functions local to it, so just put all that in a namespace and call it cassandra, and vice versa for SMS. This is a really great way to separate your application, and it also means that eventually, if you do want to split your
application into multiple projects, then components become a natural way to separate it, because it's state, and state is a great way to separate things.

Right, so let's go back to multiple projects. This is something we get a lot of. You create a project, and it's one Leiningen project. We've looked at Boot a little bit, but we haven't looked enough to talk about it much, so I'll just talk about lein. You create a project and you want to split it out, and when you do, you hit this problem. Let's assume that we have our system, we're using HTTP, let's assume that's one of our own libraries, and we want to make some changes to it. So we go into it in CIDER and let's just add a println. Ah, we can't: this buffer is read-only. The reason is that it's actually in your local Maven repository, so you obviously can't edit it, because it's inside a jar file. Who has actually hit this? Yeah, it gets really annoying. This is a side effect of having a Maven repository and Java, which Clojure is based upon. If you have multiple projects and you depend on one project down here, then when you try to navigate into that project and make some code changes, it's referencing your local Maven repository, not your local source code.

I won't go on too much about this. I will say that it's been a pretty annoying issue for us. We don't really have any answers on how best to deal with multiple Leiningen projects. The only thing we can say is that we're in a better position to deal with multiple projects in the Clojure community, because a project.clj, for instance, is just data. It's a thing you can pull into memory and parse. You can pull out all the dependency lists, you can pull out the source directories, and you can merge it all together to create some kind of zombie Frankenstein project, and that's something that we did.
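The talk does not show the code that does this merging. As a rough sketch of the idea only: the functions below treat each `defproject` form as plain data, prefix source and resource paths with the project's directory, and drop dependencies that point at the merged projects themselves. All names here (`read-project`, `project-info`, `merge-projects`, `example/all-projects`) are hypothetical, the defaults of `src` and `resources` follow Leiningen's conventions, and conflicting versions of the same dependency are not reconciled.

```clojure
(defn read-project
  "Reads the first form of a project.clj as data, without evaluating it."
  [file]
  (binding [*read-eval* false]
    (read-string (slurp file))))

(defn project-info
  "Extracts name, dependencies and directory-prefixed paths from a defproject form."
  [dir [_ project-name _version & kvs]]
  (let [opts  (apply hash-map kvs)
        paths (fn [k default] (map #(str dir "/" %) (get opts k default)))]
    {:name           project-name
     :dependencies   (:dependencies opts)
     :source-paths   (paths :source-paths ["src"])
     :resource-paths (paths :resource-paths ["resources"])}))

(defn merge-projects
  "Takes [dir defproject-form] pairs and returns one combined defproject form."
  [dir-form-pairs]
  (let [infos   (map (fn [[dir form]] (project-info dir form)) dir-form-pairs)
        own?    (set (map :name infos))      ; projects that are now on the source path
        collect (fn [k] (vec (distinct (mapcat k infos))))]
    (list 'defproject 'example/all-projects "0.0.0"
          :dependencies   (vec (remove (comp own? first) (collect :dependencies)))
          :source-paths   (collect :source-paths)
          :resource-paths (collect :resource-paths))))
```

A driver would walk the directory tree, call `read-project` on each project.clj it finds, and write the result of `merge-projects` to a generated file.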
I'm just going to show you what it looks like. This is a generated project.clj. We've got some kind of name for it, we've got a version, but importantly we've got all the dependencies from all of our projects. Well, that's a lot when you look at it in a small buffer. And we have all the source directories from all of our projects, all 70 of them, in here, and all the resource paths. You obviously won't want to update this every time you add a new project, so we just generate it. I won't show the code for it; it's a good exercise and something to do. You traverse the whole directory structure, pull out all the project.clj files, and merge them together somehow. Then you can do stuff like run all the tests in your system within the one REPL session, and you have access to all of your code in the one REPL session. It's a total hack, but we can hack it. This is pretty accessible to us as Clojure developers because it's just data that we can work with. Try doing this in Java or something else and it's pretty difficult.

Right, I was going to lead up to this, but choosing libraries is something that we also struggle with a lot. How do you choose what library to use when you're reaching some kind of scaled application? There are a few ways we think about this, but the fundamental thing is: just be extremely cynical. Rich Hickey said this best: not everything is awesome. There are so many different Clojure libraries out there, but you've got to ask a few questions. Does this force me into a particular style of programming, an idiom? Is it full of macros, and if it is, how is that going to affect how we build on this library in the future? Is it going to mean that we're just stacking on code left, right and center when we could have just cut it
out and done it ourselves?

In general, if a library does one thing really well, if it's small, if it's just input and output, with no state and no network connections, like parsing CSV, you probably don't want to implement that yourself. Have at it, go and grab a library for it. If it forces you into some kind of idiom, be really sure that that's what you want to do. core.async is a pretty good example of a library that does something we just couldn't do ourselves, so we're quite happy to use it. It forces you into a style of programming, but it solves a really cool problem.

The one to be really worried about, though, is any library which does any network operation. We've been hit pretty hard by this. Let's take an example, like accessing Dropbox or something like that. Say there's some Clojure library which does this, and you can just call it, give it a data structure, and it creates a file on a Dropbox server somewhere. If it doesn't give you timeout guarantees, if it doesn't tell you how the queuing works behind the scenes, if it's using blocking I/O instead of non-blocking I/O: these are all questions that you need to ask pretty early on, because the second your server starts to scale to any reasonable number of requests per second, these things can hit you pretty hard.

Take an example we had. We were using HBase for a while and didn't do our research on the library that we were using. It turned out that when we hit a thousand transactions a second, which happens on Black Friday (and thankfully this one didn't happen on Black Friday), the network slowed down. Requests were only going at about a hundred bytes a second or so. So you've got all these transactions coming in, then it slows down, and it turns out that our timeouts weren't working as we thought they were. We had specified a timeout of one second, I think, but it
turns out that there were something like four other timeout parameters that we had no idea about. And it wasn't your typical connection versus socket kind of timeout; there was a retry timeout, so if it did time out then it would retry by itself, but it wouldn't show that to us as the end user. So through a weird amalgamation of settings you get 20 minutes of default timeout on this library. This is something that we found out the hard way. And it was blocking, so if you can imagine a thousand requests a second for sixty seconds, you've got sixty thousand threads open. So do your due diligence on libraries. Libraries are awesome, especially at the early startup stage: just have at it, it's great. But the second you think that you're going to be scaling, just be cynical about libraries. It helps a lot.

Logging. So who here understands Java logging? Well, actually, no: there's a lot of logging stuff out there. There's Logback, there's Log4j, there's java.util.logging. Thankfully there is a facade right in the middle, SLF4J. Don't be too afraid of it. If you are getting to scale, then you're going to start to want to debug a lot of the Java libraries that underlie your system. A lot of the libraries that we take for granted in Clojure, like clj-http, which uses Apache HttpClient, are still Java underneath. If you don't understand how this works, then you're not going to be able to debug this in production. So understand it; it's worth it, and it gives you a lot of power. What we're doing now is we log using SLF4J, and then we configure appenders using Logback. So we might append in JSON to a file so that Logstash can ship it up to a server, and we might also append by UDP to a Riemann server. Once you're using a logging framework, these are
all things you can configure at runtime very easily. I've got a few little code samples around this. So does this finish at 10:40 or 10:50? 10:40? Okay, cool, I have five minutes, so actually I'll skip this. I will say I've got all this code up on GitHub, and I've got a link to it at the end, so you can have a look at it. But here's some basic stuff that you should add to your application to ensure that all logging, from java.util.logging, from SLF4J, from Logback, all goes through the same facade. Otherwise you're going to find yourself having a log4j.xml, a logback.xml, and a java.util.logging properties file. So it's really nice to put them into the one bit of configuration, and we have that inside a project.clj as well. Check this out afterwards if you're interested in this kind of logging stuff.

Another quick thing that we do is daemonizing. When you are deploying an application, you usually end up with an uberjar sitting on a system somewhere, and there are quite a few ways of actually running that uberjar. The way that we do it, which we kind of like, is we use Commons Daemon. It's an Apache project that has an interface that you implement with four methods in it: init, start, stop, destroy. Pretty basic stuff. It's not like Tomcat or anything like that; it does just one thing, very simply. So you implement this, and it turns out that this works really well with component, because in init you can construct your system using some kind of config file coming in from the file system, and then you can store that. This is the one bit of global state we have. We have no global state in our system except for this one system reference, and the only reason we have that is because Commons Daemon calls init first and assumes that you mutate some global state to create the system, so that afterwards it can start that system. So here's our system, which is nil when we start, and then we initialize it, and then we call start, and all we're doing here
is altering that root var and calling component/start on the entire system, which will then go through every component in the right order and start everything up so it's all ready to go. And then there's also stop and destroy. Your access to this is via the program jsvc, which is just a binary running on the system. You can tell it what user to run as, you can tell it what output file to use, the Java configuration, and then finally you just give it the class where that interface is implemented, and it starts. So yeah, Commons Daemon is cool, I think.

Well, I had a lot more stuff to talk about, but I'll have to cut it short. I guess the one other thing I do want to mention is that infrastructure as code gets a lot more important as you grow. When you've got one application, one uberjar to deploy, you can scp it to a server and run it. But when you've got 20 services running on hundreds of nodes, with queues and databases and load balancers and monitoring and logging and all this stuff, you want to be able to spin up a new service like that. You want to be able to say, all right, I'm going to pull these components together and create a new uberjar out of them, and that's a new service that I'm going to provide to my customer. If you're not automating the creation of the logback.xml, of the load balancer configuration, of how many machines to deploy it onto, if you're doing that by hand, you're going to be scared of creating new services. It's going to be a time-consuming thing, full of errors. Automate as much of that as possible, as early as possible, and it'll make your lives a lot easier. As an example, here's the stuff that we have to do as part of configuring a new service, and we automate all of it. We actually deploy all of this from a REPL, and if you want to hear how we do that you can come and find me afterwards, because I'm not going to recommend the technology in public. So anyway, thanks for listening. Here are the code samples. I would highly
recommend reading Release It!, a book by Michael Nygard. It'll scare the crap out of you when you're thinking about creating big systems, because it tells you all the bad things that can happen, but you should read it. Watch the Component talk by Stuart Sierra; it's great, and it goes deeper on what component is. The Language of the System by Rich Hickey was behind a lot of the design of our overall architecture. And if you do want to learn more about infrastructure as code and DevOps, which I highly recommend you do, The Phoenix Project is a great place to get started. Thank you.
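
The talk describes generating a single project.clj by traversing the directory structure, pulling out every project.clj, and merging them, but does not show the code. What follows is only a minimal sketch of that idea, not the speaker's implementation. It assumes each project.clj is a plain `(defproject name version :key value ...)` form, and it keeps only dependencies, source paths and resource paths. The names `parse-project`, `merge-projects`, `find-projects`, `project-form` and the project name `all-projects` are hypothetical, and letting the last version of a dependency win is a simplification.

```clojure
(ns example.generate-project
  "Hypothetical sketch: merge every project.clj under a root into one."
  (:require [clojure.java.io :as io]))

(defn parse-project
  "Read the text of a project.clj, (defproject name version :k v ...),
  and return its options as a map. The form is read, never evaluated."
  [text]
  (let [[_ _ _ & kvs] (binding [*read-eval* false] (read-string text))]
    (apply hash-map kvs)))

(defn- prefixed
  "Paths of one kind (e.g. :source-paths) for every project, each prefixed
  with its project directory. `default` mirrors Leiningen's default."
  [projects k default]
  (vec (for [[dir opts] (sort-by key projects)
             path       (get opts k default)]
         (str dir "/" path))))

(defn merge-projects
  "Combine a map of {project-dir -> options} into one options map.
  Dependencies are de-duplicated by artifact name; the last version seen
  wins (an assumption; a real build would want to flag conflicts)."
  [projects]
  {:dependencies   (->> (sort-by key projects)
                        (mapcat (comp :dependencies val))
                        (reduce (fn [m [artifact :as dep]] (assoc m artifact dep))
                                (sorted-map))
                        vals
                        vec)
   :source-paths   (prefixed projects :source-paths ["src"])
   :resource-paths (prefixed projects :resource-paths ["resources"])})

(defn find-projects
  "Map each directory under root that contains a project.clj to its options.
  Note this also picks up a previously generated project.clj under root."
  [root]
  (into {}
        (for [^java.io.File f (file-seq (io/file root))
              :when (= "project.clj" (.getName f))]
          [(.getParent f) (parse-project (slurp f))])))

(defn project-form
  "Build a defproject form that can be printed into a generated project.clj."
  [merged]
  (list* 'defproject 'all-projects "0.0.0-generated" (apply concat merged)))
```

Under those assumptions, something like `(spit "generated/project.clj" (pr-str (project-form (merge-projects (find-projects "services")))))` would write the combined file; the directory names here are placeholders. Because a project.clj is just data, the whole thing is ordinary map and sequence manipulation, which is the point the talk makes.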